THE UNIQUENESS OF CONFIGURAL TEST ITEM SCORES¹
PAUL HORST 
University of Washington 


PROBLEM 


In dealing with the responses of clients to test and personality items, clinical
psychologists have contended for years that it is not enough to consider merely the
sum, or even the weighted sum, of dichotomous scores on the individual items.
Zubin was among the first to consider the pattern of responses to personality
inventory items as a basis for more accurate diagnosis and prognosis. Since that time
a great deal of interest has developed in patterns or configurations of responses to
sets of stimulus elements. This interest has expressed itself in various forms. The
most general expression is to the effect that the interrelationships of a person's
traits, characteristics, and responses must be analyzed before the client can be
properly understood; or to put it even more vaguely, the whole is not equal to the
sum of the parts. A rather extensive discussion of the literature on configurational
and pattern analysis has been given by Gaier and Lee.

There has been considerable confusion between pattern analysis and profile
analysis, and it is not always clear what investigators mean when they use these
terms. However, much of the interest has concerned itself with categorical responses
to various types of personality, interest, and attitude inventories. Meehl has
considered the case where, for each of two dichotomously scored items, the
phi-coefficient with a dichotomous criterion is zero. He shows, however, that if a
score of "1" is assigned when the items are both answered either true or false and a
score of zero when one is answered true and the other false, then it is possible, by
using these derived or configural scores, to have appreciable correlation with the
criterion. This phenomenon has come to be known as Meehl's paradox. Horst has shown
mathematically that this apparent paradox is a perfectly logical mathematical
phenomenon representing but a special case of the general regression equation,


$$\tilde{Y} = b_0 + b_1 x_1 + b_2 x_2 + b_{12} x_1 x_2 , \qquad (1)$$


where the b's are the regression coefficients and the x's are the scores. This equation
is a linear function not only of the two variables but also of their product.
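
A minimal sketch of the paradox may help here. The data below are hypothetical (four equally frequent answer patterns, with the criterion equal to 1 exactly when the two items agree), and the helper name `pearson` is an assumption of this sketch. It shows each item having a zero phi-coefficient with the criterion while a configural score, written in the form of equation (1), reproduces the criterion.

```python
from math import sqrt

def pearson(u, v):
    """Pearson correlation; for two 0/1 variables this is the phi coefficient."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

# Hypothetical data: the four answer patterns occur equally often, and the
# criterion is 1 exactly when the two items are answered alike.
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
y = [1, 0, 0, 1]

phi_1 = pearson(x1, y)   # item 1 alone
phi_2 = pearson(x2, y)   # item 2 alone

# Configural score: 1 if both answers agree, 0 otherwise.
agree = [1 if a == b else 0 for a, b in zip(x1, x2)]
phi_config = pearson(agree, y)

# The same score in the form of equation (1), with
# b0 = 1, b1 = -1, b2 = -1, b12 = 2.
poly = [1 - a - b + 2 * a * b for a, b in zip(x1, x2)]
```

In this constructed case the product term carries all of the predictive information, which is the sense in which the paradox is a special case of the regression equation.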

In a recent article Fricke suggests that Meehl and Horst's configural analysis
through phi-coefficients may often lead to incorrect inferences and therefore should
not be used. This is true, however, only for the very special case considered by
Meehl. Equation (1) is applicable to any variables and to dichotomous variables in
particular. It is, however, more convenient to use equation (1) in terms of
proportions rather than $\phi$ coefficients.

If equation (1) is used as the basic regression equation for the case of two dicho- 
tomously scored items it is easy by conventional methods to determine not only the 
best weights to use in predicting a dichotomous criterion but also what we might call 
the multiple configural phi-coefficient, since all variables are dichotomous. We let 


$p_1$ be the proportion of the total group who answered item 1 true,

$p_2$ be the proportion who answered item 2 true,

$p_y$ be the proportion in the high criterion group,

$p_{12}$ the proportion who answered both items 1 and 2 true,

$p_{1y}$ the proportion in the high criterion group who answered item 1 true,

$p_{2y}$ the proportion in the high criterion group who answered item 2 true,

$p_{12y}$ the proportion in the high criterion group who answered both 1 and 2 true.


¹ This research was carried out under Contract M-743 MH (1) between the University of
Washington and the Public Health Service, National Institutes of Health.
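We make up a matrix,

$$C = \begin{pmatrix}
1 & p_1 & p_2 & p_{12} \\
p_1 & p_1 & p_{12} & p_{12} \\
p_2 & p_{12} & p_2 & p_{12} \\
p_{12} & p_{12} & p_{12} & p_{12}
\end{pmatrix} , \qquad (2)$$

a vector of proportions involving the criterion,

$$p_y' = (p_y \;\; p_{1y} \;\; p_{2y} \;\; p_{12y}) , \qquad (3)$$

and a vector of regression weights,

$$b' = (b_0 \;\; b_1 \;\; b_2 \;\; b_{12}) . \qquad (4)$$

Then the vector of regression weights will be given by,

$$b = C^{-1} p_y , \qquad (5)$$

and the multiple phi-coefficient will be,

$$R_\phi = \sqrt{1 + \frac{p_y' b}{p_y q_y} - \frac{1}{q_y}} , \qquad (6)$$

where $q_y = 1 - p_y$.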

To use the example taken from Fricke's Table 5 we have,

$$p_1 = .500, \quad p_2 = .500, \quad p_y = .500, \quad p_{12} = .400,$$
$$p_{1y} = .225, \quad p_{2y} = .275, \quad p_{12y} = .200 .$$





The high criterion group is arbitrarily taken as normal and the low as neurotic,
although it makes no difference in the solution which is taken as high and which as
low. From these data we have then,

$$C = \begin{pmatrix}
1.000 & .500 & .500 & .400 \\
.500 & .500 & .400 & .400 \\
.500 & .400 & .500 & .400 \\
.400 & .400 & .400 & .400
\end{pmatrix}$$

$$p_y' = (.500 \;\; .225 \;\; .275 \;\; .200)$$

Then from (5),

$$b' = (.500 \;\; -.250 \;\; .250 \;\; .000)$$

and from (6),

$$R_\phi = .224$$
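
The arithmetic of this example can be checked with a short sketch. It assumes that equation (5) reads b = C^{-1} p_y and that equation (6) reads R_phi = sqrt(1 + p_y'b / (p_y q_y) - 1/q_y), which is the reading of the garbled originals that reproduces the printed .224. The variable names are illustrative only.

```python
import numpy as np

# Proportions from the worked example (Fricke's Table 5 as quoted in the text).
p1, p2, p12 = 0.500, 0.500, 0.400
py, p1y, p2y, p12y = 0.500, 0.225, 0.275, 0.200

# Matrix of proportions among the predictors: constant, item 1, item 2, product.
C = np.array([
    [1.0, p1,  p2,  p12],
    [p1,  p1,  p12, p12],
    [p2,  p12, p2,  p12],
    [p12, p12, p12, p12],
])
p_y = np.array([py, p1y, p2y, p12y])

# Regression weights, assumed form of equation (5).
b = np.linalg.solve(C, p_y)

# Multiple phi-coefficient, assumed form of equation (6).
qy = 1.0 - py
R_phi = np.sqrt(1.0 + (p_y @ b) / (py * qy) - 1.0 / qy)
```

Solving the linear system rather than forming the inverse explicitly is a numerical convenience; the result is the same vector of weights.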


An even simpler method based on the work of Lubin and Osburn is available
for calculating the vector of regression coefficients. If only the multiple $\phi$ coefficient
is desired, Alf has proved that it may be calculated somewhat more simply.
Suppose there are K dichotomously scored items. Let $S_t$ be the total number of persons
with answer pattern t, and $Y_t$ the total number of persons in the high criterion group
with answer pattern t. Then the multiple $R_\phi$ is given by,

$$R_\phi = \sqrt{1 + \frac{1}{N p_y q_y} \sum_{t=1}^{2^K} \frac{Y_t^2}{S_t} - \frac{1}{q_y}} . \qquad (6a)$$


From Fricke's tables we have,

$$S_{TT} = 80, \quad S_{FF} = 80, \quad S_{TF} = 20, \quad S_{FT} = 20,$$
$$Y_{TT} = 40, \quad Y_{FF} = 40, \quad Y_{TF} = 5, \quad Y_{FT} = 15,$$
$$N = 200, \quad p_y = .5, \quad q_y = .5 .$$

Substituting these values in formula (6a) gives again,

$$R_\phi = .224$$
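
The pattern-count route can be sketched in the same way. The formula used is the assumed reading of (6a), R_phi = sqrt(1 + sum(Y_t^2 / S_t) / (N p_y q_y) - 1/q_y), and the pattern labels (TT, FF, TF, FT for the answers to items 1 and 2) are an assumption of this sketch, chosen so that the counts agree with the proportions quoted earlier.

```python
from math import sqrt

# Pattern counts for the worked example (N = 200).
# Labels give the answers to (item 1, item 2); they are assumed, not printed.
S = {"TT": 80, "FF": 80, "TF": 20, "FT": 20}   # persons with each pattern
Y = {"TT": 40, "FF": 40, "TF": 5, "FT": 15}    # of those, in the high criterion group

N = sum(S.values())
py = sum(Y.values()) / N
qy = 1 - py

# Assumed form of equation (6a).
total = sum(Y[t] ** 2 / S[t] for t in S)
R_phi = sqrt(1 + total / (N * py * qy) - 1 / qy)
```

Only the counts per pattern are needed, so no matrix has to be inverted.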
Actually, as we pointed out some years ago, it is a straightforward matter to
write down a polynomial of any degree in any number of predictor variables and to
solve for the coefficients by conventional methods. The most general polynomial
regression equation takes the form

$$\tilde{Y} = b_0 + \sum b_i x_i + \sum b_{ij} x_i x_j + \sum b_{ijk} x_i x_j x_k + \dots \qquad (7)$$

In the case of dichotomous variables equation (7) simplifies considerably with
experimental data and also in mathematical manipulation. Much of the simplification
is due to the fact that any order product terms or powers of dichotomous
measures are always either "0" or "1". Lubin and Osburn have done some important
theoretical work in the use of higher order multiple polynomials in the configural
analysis of dichotomous variables which has not yet had wide circulation.
In any case it is becoming increasingly obvious that most of what has come to be
considered configural or pattern analysis can be appropriately handled in terms of
higher order polynomials. It appears that the notion of patterns or configurations
or dynamical interrelationships in the study of personality involves little if anything
more than the concept of non-linear as distinguished from linear functions of the
variables under consideration.

However, as we indicated some years ago, one of the chief disadvantages to
the use of multiple polynomial functions in the study of human behavior is that the
general function, even though enormously flexible, is ravenous in consuming degrees
of freedom. If we have n items scored dichotomously then there will be $2^n$ possible
patterns of responses and the same number of regression coefficients to be determined
if we consider each possible distinct product term including the zero'th order or
constant term. If we have only 10 items the total number of regression coefficients
which will include all possible configurations will be 1024. To determine these
parameters with any degree of reliability may require many thousands of cases. In the
reference previously cited we suggested that one of the approaches to solving the
problem of too many parameters might be to consider a matrix consisting not only
of the original scores but successively augmented to include all possible power and
product terms as separate variables. The dimensionality of the matrix could then
be investigated by factor techniques or other methods to see whether most of its
variance could be accounted for in terms of a matrix of much lower rank.

In other words, before we worry too much about the possible importance of 
higher order interrelationships for predicting a criterion let us see whether these 
actually do provide information of any kind in addition to that provided by lower 
order terms. If it is possible to express higher order product terms as linear functions 
of lower order terms then it is easy to prove that these higher order terms will add 
nothing to the predictive efficiency of lower order terms. 

For example, suppose we have 3 items dichotomously scored. We also have a
score for each pair of items which is the product of the scores on the two items as
indicated in Table 1. Suppose now we can find a set of weights to apply to the
scores in the first part of the table which would give us the first column in the second
part. The formula

$$1 - 1x_1 + 1x_2 + 0x_3$$

does give us the scores for $x_1 x_2$. In the same way the formula

$$1 - 1x_1 + 0x_2 + 1x_3$$

gives the scores for $x_1 x_3$ as well as for $x_2 x_3$. All the information in the item pair
scores is a linear function of the individual item responses and would add nothing
to the prediction of an external criterion over that provided by the single item
responses.
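
The general point, that a product term which is an exact linear function of the items adds nothing, can be illustrated with a small sketch. The score matrix below is hypothetical and is not the paper's Table 1; it is constructed so that every person marks at least two of three items true, which makes each pair product an exact linear function of a constant and the item scores.

```python
import numpy as np

# Hypothetical 0/1 scores on three items (NOT the paper's Table 1):
# every person answers at least two of the three items true.
items = np.array([
    [1, 1, 1],
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
    [1, 1, 1],
])
X = np.column_stack([np.ones(len(items)), items])  # constant term + items

x1, x2, x3 = items.T
products = np.column_stack([x1 * x2, x1 * x3, x2 * x3])

# Least-squares weights for predicting each product column from X.
weights, *_ = np.linalg.lstsq(X, products, rcond=None)
residual = products - X @ weights
max_abs_residual = np.abs(residual).max()

# Adding the product columns does not raise the rank of the predictor matrix.
rank_items = np.linalg.matrix_rank(X)
rank_all = np.linalg.matrix_rank(np.column_stack([X, products]))
```

For these made-up data the weights for the first product come out as -1 for the constant, 1 for each of items 1 and 2, and 0 for item 3, and the residuals vanish, so the augmented matrix has the same rank as the item matrix alone.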
Since the labor involved in examining the predictive efficiency of higher order
product terms or configurations of items is so great, it would seem well, therefore, to
have some preliminary techniques for investigating the extent to which higher order
product terms do add to the variance of the predictor system. If they add nothing
then we may ignore them. If they do have unique variance, then this might or might
not be valid for a particular criterion. Ideally we should like a feasible routine for
determining whether, from among all possible product terms, a relatively small
number could be found in terms of which all the others could be expressed as linear
functions.

TABLE 1. MATRIX OF A CONSTANT TERM AND SCORES ON THREE ITEMS (columns C, 1, 2, 3),
AND MATRIX OF SECOND ORDER CROSS PRODUCTS (columns 12, 13, 23)

(table entries missing from the source text)

TABLE 2. THE t MATRIX

  S     V2    V3    V4    V5    V6
  1      0     0     0     0     0
  2      1     0     0     0     0
  3      3     1     0     0     0
  4      6     4     1     0     0
  5     10    10     5     1     0
  6     15    20    15     6     1

We shall, for the present, content ourselves with a much more modest investigation.
We begin by considering the N x m matrix, X, of zero-one scores for N persons
on m items. Now if each second order product vector of scores could be expressed as
a linear function of the m items, then the sum of these product vectors could also be
expressed as a linear function of the m items. Suppose for example we consider a
matrix $Y_2$ of all possible distinct second order product scores. This matrix will have
m(m - 1)/2 columns. By assumption an m x m(m - 1)/2 matrix, $B_2$, exists such that

$$X B_2 = Y_2 . \qquad (8)$$

Now if (8) is true then we will also have

$$X B_2 1 = Y_2 1 , \qquad (9)$$

where 1 is a column vector of 1's which sums the rows of $B_2$ and $Y_2$ respectively.
Suppose now we let

$$B_2 1 = b_2 \qquad (10)$$

and

$$Y_2 1 = V_2 . \qquad (11)$$

Then we can write from (9)

$$X b_2 = V_2 . \qquad (12)$$

Now if (12) holds approximately we have no certain assurance that (8) will hold.
That is, even though we can predict the sum of all second order products this does
not guarantee that the scores for each of the product terms can be predicted. We
may for example have instead of (8)

$$X B_2 = Y_2 + \epsilon , \qquad (13)$$

where $\epsilon$ is a matrix of residuals. It may well be that

$$\epsilon 1 = 0 \qquad (14)$$

or very nearly so. In this case (12) would still hold. Nevertheless, if (12) does not
hold then we know that (8) cannot hold and that there is information in the second
order terms not provided by X. In the same way we can let $Y_3$ be a matrix of all
distinct third order product scores and define $B_3$ accordingly so that we assume

$$X B_3 = Y_3 . \qquad (15)$$

If (15) holds then

$$X B_3 1 = Y_3 1 , \qquad (16)$$

or using notation corresponding to (12),

$$X b_3 = V_3 . \qquad (17)$$
In this paper we shall be content with determining whether the sum of higher
order product terms can be expressed as linear functions of the dichotomous item
scores. We let $V_i$ be a vector of the sum of all possible distinct product terms
involving i of the items. Alf has shown that this sum for any individual j is precisely
the binomial coefficient

$$V_{ij} = \binom{S_j}{i} = \frac{S_j!}{i! \, (S_j - i)!} , \qquad (18)$$

where $S_j$ is the person's score on all the items, i.e., the number of items marked
"true." Thus it is seen that the sum of the i'th order product scores for a person is
a function only of the number of items marked true and is independent of the particular
configuration. One can then readily write down the $V_{ij}$ terms for any particular S and i, as
illustrated in Table 2.
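
The binomial identity can be checked directly for any single answer pattern. The sketch below uses a hypothetical pattern on six items and a helper named `product_sum` (an illustrative name), and compares the brute-force sum of products with the binomial coefficient of the total score.

```python
from itertools import combinations
from math import comb, prod

def product_sum(responses, order):
    """Sum of all distinct products of `order` different item scores."""
    return sum(prod(c) for c in combinations(responses, order))

# Hypothetical answer pattern on six items: four marked true, so S = 4.
responses = [1, 0, 1, 1, 0, 1]
S = sum(responses)

# Orders 2 through 6, computed two ways.
direct = [product_sum(responses, i) for i in range(2, 7)]
binomial = [comb(S, i) for i in range(2, 7)]
```

Any other pattern with four items marked true gives the same five sums, which is why the sums depend on the total score alone.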

If now for a particular X matrix of m columns of dichotomous item scores we
should find that the multiple correlation between the item scores and the $V_i$ were
appreciably less than unity we could be sure that certain of the product terms of
order i did have unique variance which might contribute independently to the
prediction of an external criterion. This correlation can be calculated for all $V_i$ from
i = 2 to m, where m is the total number of items. Ordinarily we shall have many more
cases than distinct scores. The highest possible score will be m.

First we make up an N x S matrix, L, which will have a single "1" in each row
and the remaining elements all zero. The number of columns S will be the number of
the highest score. The i'th column corresponds to a score of i. The "1" in the j'th
row is in the column corresponding to the j'th person's score. We let $t_i$ be the i'th
column of a matrix like Table 2. Then the vector of i'th degree product sums can
be seen to be,

$$V_i = L t_i . \qquad (19)$$

We have then

$$X b_i - L t_i = e , \qquad (20)$$

where $b_i$ is to be determined so as to minimize the sum of the squares of the e's or

$$\mathrm{tr} \; e'e = \text{minimum} . \qquad (21)$$
It can readily be shown that the solution for $b_i$ is

$$b_i = \left( N_{XX} - \frac{N_X N_X'}{N} \right)^{-1} \left( N_{X1} - \frac{N_X N_1'}{N} \right) t_i \qquad (22)$$

where

$N_{XX}$ is a matrix whose ij'th element is the number of persons
marking both items i and j true,

$N_X$ is a column vector whose i'th element is the number marking
item i true,

$N_{X1}$ is a matrix whose ij'th element is the number marking item
i true, who also get a total score of j,

$N_1$ is a column vector whose j'th element is the number getting a
total score of j.

The solution for the multiple correlation can be shown to be

$$R_i^2 = \frac{b_i' \left( N_{X1} - \frac{N_X N_1'}{N} \right) t_i}{N \sigma_{V_i}^2} , \qquad (23)$$

where $\sigma_{V_i}^2$ is the variance of the elements in $V_i$.


It is fairly simple computationally, if the number of items is not over 20 or 30,
to carry the calculations along for all $t_i$ vectors which are to be tested. Thus we let t
be a matrix whose i'th column is $t_i$ and b a matrix whose i'th column is $b_i$. We also let
$$n_{XX} = N_{XX} - \frac{N_X N_X'}{N} \qquad (24)$$

$$n_{X1} = N_{X1} - \frac{N_X N_1'}{N} \qquad (25)$$

$$n_{X1} t = T . \qquad (26)$$

From (22), (24), (25) and (26)

$$b = n_{XX}^{-1} T . \qquad (27)$$

If we let $T_i$ be the i'th column of T then from (23) and (26)

$$R_i^2 = \frac{b_i' T_i}{N \sigma_{V_i}^2} . \qquad (28)$$
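
The chain of computations can be sketched on made-up data. Everything below is an assumption for illustration: the score matrix is hypothetical (all answer patterns on four items, repeated unevenly), the formulas follow the readings n_XX = N_XX - N_X N_X'/N, n_X1 = N_X1 - N_X N_1'/N, T = n_X1 t, b = n_XX^{-1} T and R_i^2 = b_i'T_i / (N sigma^2), and the result is cross-checked against an ordinary least-squares regression with a constant term.

```python
import numpy as np
from itertools import product
from math import comb

m = 4
# Hypothetical data: every answer pattern on m items, repeated an uneven
# number of times so that the items are correlated.
patterns = list(product([0, 1], repeat=m))
rows = [p for k, p in enumerate(patterns) for _ in range(1 + k % 3)]
X = np.array(rows, dtype=float)
N = len(X)

# L: one column per total score 1..m. A person with score 0 gets a zero row,
# which is harmless because all product sums are zero for that person.
scores = X.sum(axis=1).astype(int)
L = np.zeros((N, m))
idx = np.flatnonzero(scores > 0)
L[idx, scores[idx] - 1] = 1.0

orders = range(2, m + 1)
t = np.array([[comb(s, i) for i in orders] for s in range(1, m + 1)], dtype=float)

N_XX = X.T @ X
N_X = X.sum(axis=0)
N_X1 = X.T @ L
N_1 = L.sum(axis=0)

n_XX = N_XX - np.outer(N_X, N_X) / N      # assumed form of (24)
n_X1 = N_X1 - np.outer(N_X, N_1) / N      # assumed form of (25)
T = n_X1 @ t                              # (26)
b = np.linalg.solve(n_XX, T)              # (27)

N_var = N_1 @ t**2 - (N_1 @ t) ** 2 / N   # N times the variance of each V_i
R2 = (b * T).sum(axis=0) / N_var          # (28)

# Cross-check: regress V_i = L t_i on the items with a constant term.
V = L @ t
A = np.column_stack([np.ones(N), X])
coef, *_ = np.linalg.lstsq(A, V, rcond=None)
fitted = A @ coef
R2_direct = 1 - ((V - fitted) ** 2).sum(axis=0) / ((V - V.mean(axis=0)) ** 2).sum(axis=0)
```

The count-based route only needs the cross-tabulations of items against items and items against total scores, which is what made it practical for hand or desk-calculator work.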


EXAMPLE 


The method will be illustrated by a numerical example. The example is based
on six items from the Minnesota Multiphasic Personality Inventory and a sample of
97 cases. The X matrix is simply the item score matrix in which the score for an
item is "1" if marked true and "0" if false.

The $N_{XX}$ matrix is given in Table 3.
The $N_X'$ and $N_1'$ vectors are given in Table 4.

The $n_{XX}$ matrix is given in Table 5. It is obtained by subtracting from the
element in the i'th row and j'th column of Table 3 the product, divided by N, of the
i'th and j'th elements from row $N_X'$ of Table 4.

The $N_{X1}$ matrix is given in Table 6.


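TABLE 3. THE N_XX MATRIX

(table entries missing from the source text)

TABLE 4. N_X' AND N_1'

            1     2     3     4     5     6
  N_X'     92    79    53    52    36    13
  N_1'      1    19    36    28    12     1

TABLE 6. THE N_X1 MATRIX

(table entries missing from the source text)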














TABLE 5. THE n_XX MATRIX (columns 3 to 5; the other columns are missing from the source text)

              3          4          5
  1      -.2680      .6804    -1.1443
  2      -.1649    -1.3505    -1.3196
  3     24.0412    -1.4124     1.3299
  4     -1.4124    24.1237     2.7010
  5      1.3299     2.7010    22.6392
  6      -.1031     1.0309     -.8247








TABLE 7. THE n_X1 MATRIX (column 4; the other columns are missing from the source text)

              4
  1       .4433
  2       .1959
  3      2.7010
  4      6.9897
  5      5.6082
  6      2.2474
TABLE 8. THE T MATRIX (columns V3 and V4; the other columns are missing from the source text)

             V3         V4
  1      8.8454     4.3093
  2     29.4433    14.1134
  3     73.6392    36.7216
  4     85.6082    41.7835
  5     76.1134    37.7732
  6     37.4021    22.1959











TABLE 9. THE n_XX^{-1} MATRIX

(table entries not legible in the source text)








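TABLE 10. THE b MATRIX (columns V4 to V6; the other columns are missing from the source text)

             V4        V5        V6
  1      1.2861     .2988     .0225
  2      1.5061     .3621     .0257
  3      1.5621     .3451     .0187
  4      1.5955     .3091     .0142
  5      1.6229     .3708     .0307
  6      2.2080     .6469     .0823

TABLE 11. THE t^(2) MATRIX

  S      V2     V3     V4     V5     V6
  1       0      0      0      0      0
  2       1      0      0      0      0
  3       9      1      0      0      0
  4      36     16      1      0      0
  5     100    100     25      1      0
  6     225    400    225     36      1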




















TABLE 12. THE COMPUTATIONS FOR N sigma_V^2

  Row            V2             V3            V4            V5             V6
  1             430            288           103            18              1
  2            2776           2084           553            48              1
  3          184900          82944         10609           324              1
  4      1906.18556     855.092783    109.371134    3.34020618    .0103092783
  5       869.81444    1228.907217    443.628866   44.65979382    .9896907217








TABLE 13. THE FINAL COMPUTATIONS FOR R_i

  Row              V2               V3              V4             V5           V6
  1      836.51793869    1013.90374646    261.13744156    14.23198572    .11157740
  2             .9617            .8250           .5886          .3187        .1127
  3               .98              .91             .77            .56          .34





The $n_{X1}$ matrix is given in Table 7. It is obtained by subtracting from the
element in the i'th row and j'th column of Table 6 the product, divided by N, of the
i'th element of row $N_X'$ in Table 4 by the j'th element in row $N_1'$.

The T matrix is given in Table 8. It is obtained by premultiplying the t matrix
in Table 2 by the $n_{X1}$ matrix in Table 7. That is, the element in the i'th row and
j'th column of Table 8 is the sum of products of corresponding elements from the
i'th row of Table 7 and the j'th column of Table 2.

The matrix in Table 9 is the inverse of the $n_{XX}$ matrix in Table 5.

The b matrix in Table 10 is the T matrix in Table 8 premultiplied by the $n_{XX}^{-1}$
matrix in Table 9.
To calculate $N\sigma_{V_i}^2$, which will be needed in computing the multiple correlation
given by (23), we first prepare the matrix $t^{(2)}$ in Table 11. Each element in this table
is simply the square of the corresponding element in Table 2.

Table 12 gives the computation of $N\sigma_{V_i}^2$, needed for $R_i$. The i'th element in row 1 is the
sum of the products of corresponding elements of row $N_1'$ in Table 4 and column i of
Table 2. The i'th element of row 2 is the sum of the products of corresponding
elements from $N_1'$ of Table 4 and column i of Table 11. The i'th element of row 3 is
obtained by squaring the i'th element of row 1. The i'th element of row 4 is obtained
by multiplying the i'th element of row 3 by 1/N. The i'th element in the fifth row is
obtained by subtracting the i'th element in the fourth row from the i'th element in
the second row. This fifth row is $N\sigma_{V_i}^2$.

Table 13 gives the final computations for the multiple correlations. The i'th
element in the first row is the sum of products of corresponding elements from the
i'th column of the T matrix in Table 8 and the i'th column of the b matrix in Table
10. The i'th element in the second row is the i'th element in the first row divided by
the i'th element in the fifth row of Table 12. This is the square of the multiple
correlation. The i'th element in the third row is the square root of the i'th element in
the second row and is the multiple correlation itself.
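
The row operations described for Tables 12 and 13 can be retraced in a few lines. This is a sketch under stated assumptions: the score counts are taken from the legible part of row N_1' of Table 4, with the count for a score of 1 inferred from N = 97 (it has no effect here, since every product sum is zero at a score of 1); the t columns are generated from equation (18); and the numerators are copied from the first row of Table 13.

```python
from math import comb, sqrt

N = 97
# Persons at each total score 1..6 (row N_1' of Table 4). The first entry is
# inferred from N = 97 and does not influence any of the sums below.
N_1 = [1, 19, 36, 28, 12, 1]
orders = [2, 3, 4, 5, 6]

# Column i of the t matrix: C(S, i) for S = 1..6, from equation (18).
t = {i: [comb(s, i) for s in range(1, 7)] for i in orders}

row1 = [sum(n * v for n, v in zip(N_1, t[i])) for i in orders]       # sum of V_i
row2 = [sum(n * v * v for n, v in zip(N_1, t[i])) for i in orders]   # sum of V_i squared
row5 = [r2 - r1 * r1 / N for r1, r2 in zip(row1, row2)]              # N times variance

# First row of Table 13 (b_i' T_i), copied from the text.
bT = [836.51793869, 1013.90374646, 261.13744156, 14.23198572, 0.11157740]
R = [sqrt(num / den) for num, den in zip(bT, row5)]
```

Rounded to two places, the values in `R` can be compared with the third row of Table 13.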

It is seen that the multiple correlation between the item scores and the sum of 
second order products is .98 which is very high and would indicate that in general, 
for these data, configural scores from the items considered two at a time would add 
little information of any kind to that provided by the single item responses. Since 
only one person is involved in the 6th order product it is questionable whether the 
low multiple of .34 has much significance so far as the 6th order product is con- 
cerned. However, it may well be that the 4th and 5th order product terms may 
contribute useful valid variance. 


SUMMARY 


Much further work both experimental and theoretical is still required as a basis 
for investigating the unique variance of higher order configurations. The brief and 
very limited data investigated in this study would lend quantitative support to the 
notion implicit in the contentions of clinical psychologists that higher order con- 
figurations of responses may indeed provide unique and perhaps important informa- 
tion about a client. 
STATISTICS FOR THE INVESTIGATION OF INDIVIDUAL CASES*
R. W. PAYNE AND H. GWYNNE JONES 
Institute of Psychiatry, University of London, Maudsley Hospital 


PROBLEM 
Much of the work of a clinical psychologist consists of making relatively routine 
psychological measurements of fairly well established traits, either cognitive or 
orectic. It is well known, however, that there can be no measurement without error. 
The psychologist must have the means of taking error into account if he is to assess 
his test scores intelligently. There appear to be three main types of question which 
face clinical psychologists: 


1. The Abnormality of a Discrepancy between Two Scores 


This problem arises every time a psychologist gives more than one measure.
Perhaps the commonest example is the Wechsler-Bellevue Intelligence Scale. This
test provides two rather different measures of intelligence, the "Verbal Scale IQ"
and the "Performance Scale IQ". It is a common experience that these two scores
are divergent. In fact the discrepancy may suggest interesting hypotheses in line
with other abnormalities the patient shows. However, before we can assess such a
discrepancy, we must take into account two factors. We know that neither scale is
perfectly reliable and we also know that the scales are not perfectly correlated.
Therefore, many normal people would show discrepancies between the two scales
which one need not take seriously. The first question we can ask ourselves then, is
how frequently would a discrepancy as large as the one we observe occur in the
normal population? That is, how "abnormal" is the difference we observe between our
test scores?


2. The reliability of a discrepancy between two scores 


In certain cases, we may have occasion to give two tests which measure rather
different traits. For example, we may give a test of long term retention, and a test of
general intelligence. It may be the case that these tests have a very low
intercorrelation in the general population, so that quite large discrepancies between these
scores could be quite "normal" or usual in the general population. Nevertheless on clinical
grounds, we might expect our patient to have a lower memory test score than a
general intelligence test score. We are not implying that this would be an abnormally
large discrepancy. Many people may have as large differences. We are implying,
however, that it is a measurable difference. We know that neither test is perfectly
reliable, so that small differences will occur by "chance". What we wish to know is
how large a difference between any two scores must be before we can be sure the
difference could not be due merely to error of measurement of the tests.


3. Testing a Clinical Prediction 


A third type of problem is slightly different. Very often the clinical psychologist
finds himself repeating a measurement with a certain expectation or "prediction".
For example, a patient may obtain an "average" IQ when first seen. Two years later,
there may be strong clinical grounds for believing that deterioration has taken place.
We, therefore, wish to retest him on the same (or a similar) test of intelligence to
confirm the hypothesis that he has deteriorated. We may, indeed, find that his score
is now below average. Have we in fact confirmed our hypothesis?

Again we know that tests are not perfectly reliable and that such changes in
score occur in perfectly normal people. Essentially we need a control group. We need
to know what proportion of individuals like our patient, of the same IQ on the first
test, and who have not deteriorated would show an equal drop in IQ on retest. If
the figure is fairly large, of course, our result does not prove that deterioration has
really occurred. The practising psychologist will not have time to conduct the
appropriate control experiment. Is there any other way of providing an approximate
answer?

*Editorial Note: The authors use the term "standard error of prediction" which is customarily
called a "standard error estimate" in America, and in this country we uniformly refer to the
"proportions" or "probabilities" corresponding to various standard score values instead of the
term "percentile".


SUGGESTED SOLUTIONS 


The clinical teaching section of the Institute of Psychiatry (Maudsley Hospital)
became aware of these problems several years ago. The following simple statistical
models were eventually evolved and have since been applied routinely in clinical
practice.¹


1. The Abnormality of a Discrepancy Between Two Scores 


Let us call the two raw scores X and Y. We are required to discover how
frequently a discrepancy or difference between them occurs in the general population.
Let us call this difference D, which is merely X - Y. Provided that the bivariate
distribution of the two scores is normal, the percentile value for any D score can be
obtained from the ordinary table of the normal curve relating percentiles to standard
scores. All we need to do is to express our D score as a standard score in the usual
way, substituting in the formula:

$$Z_d \ (\text{standard "D" score}) = \frac{D - \bar{D}}{\sigma_d}$$

In this equation $D = X - Y$, $\bar{D}$ (mean D) $= \bar{X} - \bar{Y}$, and $\sigma_d$ (standard error
of difference, or standard deviation of the D scores) is

$$\sigma_d = \sqrt{\sigma_x^2 + \sigma_y^2 - 2 r_{xy} \sigma_x \sigma_y}$$

where $\sigma_x$ is of course the standard deviation of test x, $\sigma_y$ the standard deviation of
test y, and $r_{xy}$ the Pearson product-moment correlation between tests x and y.

This general method will tell us how frequently this particular D score occurs 
in the standardization population. Our D score, however, was the difference between 
two raw scores, and as such is influenced by both X and Y. If, however, test x has 
a very much larger standard deviation (and range) than test y, score D will be in- 
fluenced much more by score X than score Y (i.e., D scores will correlate higher with 
X than with Y scores). In an extreme case, an “abnormally” (infrequently) large 
X score would automatically produce an “abnormally” large D score. This would 
be the case if raw X and Y scores are used. Psychologically, however, we are not 
really concerned with the discrepancy between raw X and Y scores, as raw scores on 
most psychological tests are quite arbitrary. What we are concerned with is a dis- 
crepancy between the percentile standing on test x and the percentile standing on 
test y. Thus, it is really the “abnormality” (frequency) of a discrepancy between two 
percentiles that concerns us. We can easily estimate this if we first express our raw 
X and Y scores as standard scores, and then assess the frequency of the difference 
between these two standard scores in the general population. 

Thus, we can first transform our raw X and Y scores into standard scores
according to the formulae:

$$Z_x = \frac{X - \bar{X}}{\sigma_x} , \qquad Z_y = \frac{Y - \bar{Y}}{\sigma_y}$$

We can then find the difference between the two standard scores: $D_z = Z_x - Z_y$
($D_z$ = difference between standard scores). This difference (discrepancy) can then
itself be expressed as a standard score, so that its percentile position can be
ascertained:

$$Z_{d_z} = \frac{D_z - \bar{D}_z}{\sqrt{\sigma_{z_x}^2 + \sigma_{z_y}^2 - 2 r_{xy} \sigma_{z_x} \sigma_{z_y}}}$$

However, $\bar{D}_z$ (the mean difference between standard scores of x and y) is zero,
as standard scores are made to have a mean of zero. Similarly $\sigma_{z_x}$ (the standard
deviation of $Z_x$, or x standard scores) is one, as standard scores are made to have a
standard deviation of one. Thus:

$$Z_{d_z} = \frac{D_z}{\sqrt{1 + 1 - 2 r_{xy}}}$$

¹ The use of the regression equation to answer the third type of problem has been illustrated by
Slater in a slightly different form. The techniques to be discussed were largely initiated by Dr. A.
Lubin, Dr. M. B. Shapiro and Professor H. J. Eysenck in mutual discussion. The formulae quoted in
subsequent sections for the standard error of a difference, and the regression equation are to be found
in any standard textbook of statistics, for example P. O. Johnson or Q. McNemar.








By consulting the tables for the normal curve, we can transform this Z into a
percentile and discover how frequently the difference occurs in the general
population. A "two tailed" test tells us how frequently a discrepancy of this magnitude or
greater between the two standard scores occurs in the general population, regardless
of the direction of the discrepancy. A "one tailed" test tells us how frequently the
difference occurs in this particular direction (for example how frequently $Z_x$ exceeds
$Z_y$ by this degree in the population).

Example: 


A man of 25 obtains a Wechsler Verbal Scale IQ of 120, and a Wechsler
Performance Scale IQ of 105. We wish to know how common such a large discrepancy
actually is. We know that the mean of both Wechsler scales is 100, and the standard
deviation is 15. Thus, the Verbal Score, expressed as a standard score, is
(120 - 100)/15 = 1.33. The Performance IQ expressed as a standard score is
(105 - 100)/15 = 0.33.

The difference between these two standard scores (Verbal minus Performance)
is 1.00. The correlation between the Wechsler Verbal IQ and Performance Scale IQ
is 0.71.² Thus the standard error of the difference between these two
Wechsler standard scores would be:

$$\sqrt{1 + 1 - 2(0.71)} = 0.762$$

Dividing the difference (minus the mean difference, in this case zero) by the
standard error of the difference, we find that:

$$Z_{d_z} = \frac{1.00}{0.762} = 1.31$$


Consulting the table for converting standard scores into percentiles we
see that this corresponds to a percentile value of 90.5. Thus we see that 19% of the
standardization population would have a discrepancy this large (or larger) between
Verbal and Performance IQ (the "two tailed" test); 9.5% of the standardization
population would have a discrepancy this large (or larger) in favor of the Verbal
IQ, as in this case; and 9.5% of the standardization population would have a
discrepancy this large (or larger) in favor of the Performance IQ (the "one tailed"
test). Whether this discrepancy is infrequent enough in the general population to be
regarded as "abnormal", and perhaps worthy of further investigation, must be left
to the judgment of the individual clinical psychologist.
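
The worked example can be retraced with a short sketch. The function names `normal_cdf` and `discrepancy_z` are illustrative, and the normal-curve table is replaced by the error function from the standard library, so the tail areas may differ from the tabled figures in the third decimal place.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Cumulative probability of the standard normal distribution."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def discrepancy_z(x, y, mean_x, mean_y, sd_x, sd_y, r_xy):
    """Standard score of the difference between two standardized test scores."""
    z_x = (x - mean_x) / sd_x
    z_y = (y - mean_y) / sd_y
    # Standard scores have unit variance, so the variance of their
    # difference is 1 + 1 - 2 * r_xy.
    return (z_x - z_y) / sqrt(2.0 - 2.0 * r_xy)

# Verbal IQ 120, Performance IQ 105, both scales mean 100 and SD 15, r = 0.71.
z = discrepancy_z(120, 105, 100, 100, 15, 15, 0.71)
one_tailed = 1.0 - normal_cdf(abs(z))   # discrepancy this large in one direction
two_tailed = 2.0 * one_tailed           # discrepancy this large in either direction
```

The same function applies to any pair of tests for which the standardization means, standard deviations and intercorrelation are known.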


² We would not use the value of .83 which Wechsler also quotes after correction for attenuation,
as we are concerned here with the empirically found range of verbal-performance discrepancies in the
standardization population.
2. The Reliability of a Discrepancy Between Two Scores


Let us again call the two raw scores X and Y. We are required to discover
whether the discrepancy between them (D, or X - Y) is large enough to be outside
the range of differences attributable to the errors of measurement of the two tests.
The "standard error of measurement" is the usual psychological estimate of the
range of error which we can expect from a single test. The standard error of
measurement is usually defined according to the formula:

$$S.E._m = \sigma_x \sqrt{1 - r_{xx}}$$

where $r_{xx}$ refers to the coefficient of reliability of the test concerned. This is either a
test-retest product moment correlation coefficient, a "split-half" product moment
correlation coefficient corrected for attenuation, or sometimes a form vs. form
correlation coefficient.

If we accept the equation for the standard error of measurement, then the proba- 
bility of an obtained score ‘‘X” on any test departing from a given score which we 
could call ‘“T” through error of measurement, can easily be obtained according to 
the formula: 

xX -—T 


Ox Vv 1 — ry 


Thus $(\sigma_x \sqrt{1 - r_{xx}})^2$ represents the error variance of test x. However, we know
that the standard error of a difference between two scores X and Y is given by the
formula:

$$\mathrm{S.E.}_{\mathrm{diff}} = \sqrt{\sigma_x^2 + \sigma_y^2 - 2 r_{xy} \sigma_x \sigma_y}$$


If we replace the variance of X by the error variance of X and the variance of Y
by the error variance of Y in the above equation, we will estimate the standard error
of the differences attributable solely to error. As error is uncorrelated, however, the
last term ($-2 r_{xy} \sigma_x \sigma_y$) will become zero. (Errors of measurement on test x are
uncorrelated with errors on test y). Thus the standard deviation of the discrepancies
between X and Y attributable solely to error of measurement as defined, will be given
by the formula:

$$\sigma_{\mathrm{error\ diff}} = \sqrt{(\sigma_x \sqrt{1 - r_{xx}})^2 + (\sigma_y \sqrt{1 - r_{yy}})^2}$$


If we wish to determine the probability of any difference (X − Y) between
two scores being due solely to the errors of measurement of the two tests, we can
thus set up the following ratio:

$$Z = \frac{D - \bar{D}}{\sqrt{(\sigma_x \sqrt{1 - r_{xx}})^2 + (\sigma_y \sqrt{1 - r_{yy}})^2}}$$

where $D = X - Y$ and $\bar{D} = \bar{X} - \bar{Y}$.


As we have seen, however, D, the difference between X and Y in raw scores, can
be correlated much more highly with X than with Y, if $\sigma_x$ is much greater than $\sigma_y$. In
an extreme case for example, where the range of X is very much larger than the range 
of Y, an extremely high X score would always be “reliably” different from a Y score 
of any value, whatever the reliability of test Y, if the above formula were used. In 
other words, the judgment of whether or not the difference D were large enough to 
be measurable, would be almost entirely a function of the size of X. This is clearly 
not what is required. Again, this difficulty can be surmounted if we first equalize the 
variance of X and Y by converting both scores to standard scores. We can then 
apply the formula above, and obtain a valid answer. In standard score units, the 
formula becomes: 





$$Z = \frac{D_z - \bar{D}_z}{\sqrt{(\sqrt{1 - r_{xx}})^2 + (\sqrt{1 - r_{yy}})^2}} = \frac{D_z}{\sqrt{(1 - r_{xx}) + (1 - r_{yy})}}$$

where:

$D_z = Z_x - Z_y$, $\bar{D}_z = \bar{Z}_x - \bar{Z}_y = 0$,
$r_{xx}$ = coefficient of reliability of test x, and
$r_{yy}$ = coefficient of reliability of test y.








Our standard score, Z, can be converted into a percentile by consulting the 
table for the normal curve in the usual way. 
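
The contrast between the raw-score ratio and the standard-score ratio described above can be illustrated with a minimal sketch. The two tests, their means, standard deviations and reliabilities below are hypothetical values chosen only for illustration; the function names are likewise assumptions and do not come from the paper.

```python
from math import sqrt

def z_raw(x, y, mean_x, mean_y, sd_x, sd_y, r_xx, r_yy):
    """Raw-score ratio: (D - mean D) divided by the error SD of the difference."""
    d = x - y
    d_mean = mean_x - mean_y
    error_sd = sqrt(sd_x ** 2 * (1 - r_xx) + sd_y ** 2 * (1 - r_yy))
    return (d - d_mean) / error_sd

def z_standard(x, y, mean_x, mean_y, sd_x, sd_y, r_xx, r_yy):
    """Standard-score ratio: both scores are converted to Z units first."""
    z_x = (x - mean_x) / sd_x
    z_y = (y - mean_y) / sd_y
    return (z_x - z_y) / sqrt((1 - r_xx) + (1 - r_yy))

# Hypothetical tests: equal means and reliabilities, very unequal spread.
tests = dict(mean_x=100, mean_y=100, sd_x=50, sd_y=5, r_xx=0.90, r_yy=0.90)

# A subject two standard deviations above the mean on BOTH tests.
raw = z_raw(200, 110, **tests)
std = z_standard(200, 110, **tests)
print(f"raw-score Z = {raw:.2f}, standard-score Z = {std:.2f}")
```

Under these assumed values the raw-score ratio is large simply because X has the wider range, while the standard-score ratio is zero, since the subject holds the same relative position on both tests.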


Example: 


A male patient of 20 was given the “General Aptitude Test Battery” of the
United States Employment Service.©? It was expected from his Wechsler results
that his Spatial Aptitude would be higher than his “G” score on the GATB. He
obtained a “G” score of 75 and a “Spatial” score of 86. We wish to discover whether
or not this difference is in fact measurable, that is, whether a difference this large is
within the range of differences one might expect from error of measurement.

If we call the “Spatial” score “X” and the “G” score “Y”, we can first convert
these scores into standard scores. We know that for both these tests, the mean is 100
and the standard deviation is 20. Thus:

$$Z_x = \frac{X - \bar{X}}{\sigma_x} = \frac{86 - 100}{20} = -0.70 \quad \text{and} \quad Z_y = \frac{Y - \bar{Y}}{\sigma_y} = \frac{75 - 100}{20} = -1.25$$


The coefficient of reliability of the Spatial test (test “X”) is .88, and the coefficient
of the “G” test (test “Y”) is .94. These are test-retest reliability coefficients
quoted in the test manual: ». ') for a male population comparable to the patient. 
We can now substitute in the equation: 


$$Z = \frac{D_z}{\sqrt{(1 - r_{xx}) + (1 - r_{yy})}} \quad \text{where } D_z = Z_x - Z_y$$

$$Z = \frac{-0.70 - (-1.25)}{\sqrt{(1 - .88) + (1 - .94)}} = \frac{0.55}{0.424} = 1.30$$


Consulting the table for the normal curve we find that a Z of 1.30 corresponds to
a percentile value of 90.3. Thus differences as large as or larger than our eleven point
discrepancy between “G” and “Spatial” scores could occur 19.4% of the time
through error of measurement alone. Differences in the expected direction (i.e.,
“Spatial” higher than “G”) could occur 9.7% of the time through error of measurement
(the one tail test). Thus we cannot be certain that this test discrepancy represents
a “real” difference between these two aptitudes, and the assessment of these
probabilities must again be left to the individual clinician.
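
The arithmetic of this example can be checked with a short sketch. It assumes the normal curve is evaluated with Python's `statistics.NormalDist` instead of a printed table; the function name `discrepancy_z` is hypothetical. Because the sketch does not round intermediate values, its probabilities may differ slightly in the last digit from the table-based figures quoted above.

```python
from math import sqrt
from statistics import NormalDist

def discrepancy_z(x, y, mean, sd, r_xx, r_yy):
    """Z for a difference between two scores expressed in standard-score units."""
    z_x = (x - mean) / sd
    z_y = (y - mean) / sd
    return (z_x - z_y) / sqrt((1 - r_xx) + (1 - r_yy))

# Values from the example: Spatial = 86 (X), G = 75 (Y), mean 100, S.D. 20.
z = discrepancy_z(x=86, y=75, mean=100, sd=20, r_xx=0.88, r_yy=0.94)

# Proportion of error-only differences at least this large.
one_tailed = 1 - NormalDist().cdf(z)   # in the expected direction only
two_tailed = 2 * one_tailed            # in either direction

print(f"Z = {z:.2f}")
print(f"one-tailed = {one_tailed:.1%}, two-tailed = {two_tailed:.1%}")
```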


3. Testing a Clinical Prediction 


If we have given a certain test to a patient, and expect a drop in score following 
some trauma or other circumstance, and subsequently retest our patient on the same 
test, we wish to discover what proportion of subjects, all of whom had the same initial 
score on the test, but who did not suffer the trauma, would be expected to show a drop 
this large. This is a relatively simple matter, if we have the relevant data for the 
test in question. What is required is the test-retest data for the test over an equiva- 
lent period of time, for a representative sample of subjects similar to the patient. 
In other words, our patient’s original test score must fall within the distribution of
scores obtained by our statistical “control group” on the initial testing.

Let us call the initial test score X, and the score obtained on retest, Y. The
regression line of Y upon X is that straight line which best fits (by “least squares”)
the points represented by the mean Y score of each group (column) of subjects with
homogeneous X scores. If we use $\hat{Y}$ to represent that Y score which would be predicted
from the regression equation from our patient’s X (initial test) score, then $\hat{Y}$
also represents the mean Y score (retest score) of all those subjects who started out
with the same X (initial test) score as our patient. It is this group then, which
serves as our patient’s “control” group. The standard error of prediction using this
regression equation represents the standard deviation of the Y (retest) scores of all
those patients who started out with the same X (test) score as our patient. We have
only to determine then, whether our patient’s Y (retest) score falls within or outside
the distribution of Y (retest) scores obtained by those people with the same initial
X (test) score.

In practice then, we would first discover the average Y (retest) score of those
people with the same initial X (test) score as our patient, by substituting in the
regression equation:

$$\hat{Y} = a + bX \quad \text{where} \quad a = \bar{Y} - b\bar{X} \quad \text{and} \quad b = r_{xy} \frac{\sigma_y}{\sigma_x}$$

$\hat{Y}$ = mean retest (Y) score of people who start out with score X
X = patient’s initial score
$\bar{X}$ = mean of all subjects on test X (test)
$\bar{Y}$ = mean of all subjects on test Y (retest)
$\sigma_x$ = standard deviation of all subjects on test X (test)
$\sigma_y$ = standard deviation of all subjects on test Y (retest)
$r_{xy}$ = product moment correlation between test x and test y (i.e. test-retest
correlation)


Having discovered our value for $\hat{Y}$, we can then discover the standard deviation
of the Y scores (retest scores) of those people with the same initial X (test) score.
This is the standard error of prediction, and is merely:

$$\mathrm{S.E.}_{\mathrm{predic.}} = \sigma_y \sqrt{1 - r_{xy}^2}$$

Having obtained this value, we merely determine whether our patient’s score
falls within or outside this group of retest scores for people with the same initial
score, by expressing our subject’s score as a standard score for this distribution
according to the formula:

$$Z = \frac{Y - \hat{Y}}{\mathrm{S.E.}_{\mathrm{predic.}}}$$


The final Z value can be converted to a percentile by consulting the standard 
table of the normal curve. Since a prediction has been made in each case where this 
statistical model is used, a one tailed test of significance is appropriate. 


Example: 


A 30 year old schizophrenic patient is given the Wechsler Bellevue Form I, and 
obtains a total weighted score of 110 (IQ = 112). He is believed to have deteriorated 
intellectually following treatment with Serpasil, and is retested one month later. He 
now obtains a total weighted score of 83 (IQ = 94) on Wechsler Form I. We wish to 
discover how many schizophrenics of this initial IQ who have not had any treatment 
intervening between test and retest on the Wechsler over a four week period, would 
have dropped to this extent. 

Some control data is given in an article by Hamister®) who tested a group of 
34 schizophrenics on Wechsler I and retested them four weeks later. Their mean 
total weighted score on the first test was 92.97, with a S.D. of 26.00. Their mean 
total weighted score on retest was 104.26, with a S.D. of 28.91. The correlation 
between test and retest was .84. 





STATISTICS FOR THE INVESTIGATION OF INDIVIDUAL CASES 121 


First we must use the regression equation to discover what the average retest
score ($\hat{Y}$ in the formula) was in Hamister’s group, for patients with the same initial
Wechsler score:

$$\hat{Y} = a + bX \quad \text{where} \quad a = \bar{Y} - b\bar{X} \quad \text{and} \quad b = r_{xy} \frac{\sigma_y}{\sigma_x}$$

We know that $\bar{Y}$ = 104.26 (retest mean), $\bar{X}$ = 92.97, $\sigma_y$ = 28.91, $\sigma_x$ = 26.00,
$r_{xy}$ = .84, X (patient’s initial score) = 110, and Y (patient’s retest
score) = 83.


Therefore:

$$b = \frac{.84 \times 28.91}{26.00} = 0.93$$

and:

$$a = 104.26 - (0.93 \times 92.97) = 17.80$$

Thus:

$$\hat{Y} = 17.80 + (0.93 \times 110) = 120.10$$
The standard deviation of retest (Y) scores of schizophrenics with this initial X
score is given by the formula:

$$\mathrm{S.E.}_{\mathrm{predic.}} = \sigma_y \sqrt{1 - r_{xy}^2} = 28.91 \sqrt{1 - .84^2} = 15.69$$

Thus the patient’s standard score position in the distribution of retest scores
obtained by patients with the same initial test score is given by the formula:

$$Z = \frac{Y - \hat{Y}}{\mathrm{S.E.}_{\mathrm{predic.}}} = \frac{83 - 120.10}{15.69} = -2.36$$


Consulting the table for the normal curve, this is equivalent to a percentile value
of 1. Thus we see that only 1% of the schizophrenics in the “control” sample who
started out with this IQ on the Wechsler, would have dropped as much on retest
after four weeks as has our patient. This is consistent with our hypothesis that he
has deteriorated.
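
The steps of this example can be reproduced in a few lines. This is a minimal sketch assuming the control-group statistics quoted above and a normal distribution of retest scores about the regression line; the function name `retest_z` is hypothetical. The sketch carries the slope unrounded, so its intermediate values differ slightly from the hand calculation, which rounds b to 0.93.

```python
from math import sqrt
from statistics import NormalDist

def retest_z(x, y, mean_x, mean_y, sd_x, sd_y, r_xy):
    """Return the predicted retest score, the S.E. of prediction, and Z."""
    b = r_xy * sd_y / sd_x              # regression slope of Y on X
    a = mean_y - b * mean_x             # intercept
    y_hat = a + b * x                   # mean retest score for this initial score
    se_pred = sd_y * sqrt(1 - r_xy ** 2)
    return y_hat, se_pred, (y - y_hat) / se_pred

# Control data quoted from Hamister's group; patient scored 110, then 83.
y_hat, se_pred, z = retest_z(x=110, y=83, mean_x=92.97, mean_y=104.26,
                             sd_x=26.00, sd_y=28.91, r_xy=0.84)

# One-tailed proportion of controls expected to score this low or lower.
proportion = NormalDist().cdf(z)

print(f"predicted retest = {y_hat:.2f}, S.E. = {se_pred:.2f}, Z = {z:.2f}")
print(f"proportion at or below the patient's retest score = {proportion:.3f}")
```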


SUMMARY 
This paper suggests simple statistical models for use in solving the following 
problems: 


1. To estimate how large a discrepancy between any two test scores need be, for 
it to be “abnormally” large in the standardization population. That is, to estimate 
the frequency of occurrence in the standardization population of any given dis- 
crepancy between two test scores. 


2. To establish how large a difference between two test scores need be, for it to be
outside the range of differences produced solely by the errors of measurement of the
two tests. That is, to estimate how large a discrepancy must be for us to judge it a
“measurable” difference.


3. To estimate how large a predicted change in score (following treatment, trauma, 
etc.) need be, to lie outside the range of changes found in a control group which has 
not been subjected to the intervening process. 


SOME PATTERNS OF DEPRESSION


JAMES P. O’CONNOR EDWARD C. STEFIC CLEMENT J. GRESOCK 

PROBLEM


In his review of studies involving the scales of the Minnesota Multiphasic 
Personality Inventory, Cottle“ concludes that the depression (D) scale is one of the 
most important. Others have described it as an indicator of the seriousness of person- 
ality disintegration “®, as the most sensitive scale to change in psychiatric therapy ®, 
and as the best single index of adjustment in the inventory). Both Meehl"* and 
Kaufmann? report that the D scale tends to be highest in most abnormal profiles of 
MMPI scores. Despite the apparent clinical significance of the D scale there has 
been no systematic investigation of the scale items for the purpose of determining 
its internal structure. At present we have little more than the statement of Hatha- 
way and McKinley“? that a high D score indicates poor morale, feelings of useless- 
ness, inability to assume an optimistic outlook, lack of self confidence, tendency to 
worry, narrowness of interests, and introversion. 

Factor analytic studies of the MMPI“: 3: 17. 18. 1) which have all been restricted 
to inter-scale correlations, suggest that D is functionally complex. Most of them 
have yielded two major factors with D loading significantly on both. In view of the
nature of the variables involved these probably represent second-order factors
extracted at the first-order level. In any case, previous studies reveal little of the
number and nature of the underlying factors in the D scale itself. Consequently, it is the
purpose of this study (a) to determine the number of factorial dimensions in the
MMPI D scale by an analysis of the scale items, and (b) to ascertain what these
dimensions are.


PROCEDURE 


The subjects were 300 white, male, veteran, neuropsychiatric hospital patients 
of mixed diagnoses who were administered the booklet form of the MMPI as part of 
a standard admissions battery. Their mean age was 32.36 years, with a standard 
deviation of 7.35 years. 

Tetrachoric correlation coefficients (cosine pi) between the 60 D scale items
were computed. Items having marginal frequencies of less than 10 per cent or showing
uniformly low correlations were dropped. The resulting 49 x 49 matrix was
factored by the complete centroid method until five factors were extracted. The 
fifth factor residuals had a mean absolute value of .073 and their distribution was 
symmetrical about zero. The orthogonal factors were rotated obliquely by two- 
dimensional sections until simple structure was approximated. The inter-correlations 
between the primary factors were factored and rotated orthogonally to determine 
possible second order factors. 
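
The two computational steps named above, the cosine-pi approximation to the tetrachoric coefficient and centroid extraction, can be sketched as follows. This is an illustrative simplification and not the authors' procedure: it assumes the diagonal of the matrix already holds communality estimates and leaves them fixed, whereas the paper does not describe how communalities were estimated or revised in the "complete centroid method". The function names and the small test matrix are hypothetical.

```python
from math import cos, pi, sqrt

def cosine_pi(a, b, c, d):
    """Cosine-pi approximation to tetrachoric r for the 2x2 table [[a, b], [c, d]],
    where a and d are the concordant cells and b and c the discordant cells."""
    if b * c == 0:
        return 1.0
    if a * d == 0:
        return -1.0
    return cos(pi / (1 + sqrt((a * d) / (b * c))))

def centroid_factors(r, n_factors):
    """Extract up to n_factors centroid factors from a symmetric matrix r
    (list of lists) whose diagonal holds communality estimates (an assumption)."""
    n = len(r)
    resid = [row[:] for row in r]
    loadings = []
    for _ in range(n_factors):
        signs = [1] * n
        # Reflect variables until every off-diagonal column sum is non-negative.
        while True:
            off = [sum(signs[i] * signs[j] * resid[i][j]
                       for i in range(n) if i != j) for j in range(n)]
            j_min = min(range(n), key=off.__getitem__)
            if off[j_min] >= -1e-12:
                break
            signs[j_min] = -signs[j_min]
        # Column sums of the reflected matrix, diagonal included.
        col = [sum(signs[i] * signs[j] * resid[i][j] for i in range(n))
               for j in range(n)]
        total = sum(col)
        if total <= 1e-12:      # nothing left to extract
            break
        # Loadings, returned to the original orientation of each variable.
        factor = [signs[j] * col[j] / sqrt(total) for j in range(n)]
        loadings.append(factor)
        resid = [[resid[i][j] - factor[i] * factor[j] for j in range(n)]
                 for i in range(n)]
    return loadings

# Hypothetical one-factor matrix built from known loadings, to check recovery.
true = [0.8, 0.6, 0.5]
matrix = [[x * y for y in true] for x in true]
factors = centroid_factors(matrix, 2)
print(factors)
```

With a matrix generated from a single set of loadings, the first centroid factor reproduces those loadings and the residuals leave nothing for a second factor, which is the stopping behaviour the residual check in the text relies on.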


RESULTS AND DISCUSSION 


The rotated oblique factor matrix V is presented in Table 1.¹ In interpreting
the factors, those items with loadings of .35 or more will be given principal
consideration.

Most of the items which are significantly correlated with Factor A, and which
show negligible loadings on the other factors, are concerned with expressions of
general ill-health. These include statements involving a denial of the following: that
one never felt better, that one has been well most of the time, that one’s appetite is
good, that one’s health is as good as most of his friends, and that one is about as able
to work as he ever was. This picture is rounded out by descriptions of fitful and
disturbed sleep, of being troubled by attacks of nausea and vomiting, of working under
a great deal of tension, and of feeling weak all over much of the time. Also related
to Factor A are items in which the patient denies both that his judgment is better
than it ever was and that his memory seems to be all right. This factor is interpreted
as one reflecting hypochondriasis and appears closely related to the factors similarly
identified by Eysenck “? and Cottle“. After designating Factor A as “hypochondriasis”
a check was made of the D scale items which overlap those of the MMPI
hypochondriasis scale. It was found that six of the eight items common to both
scales are represented in Factor A and that the two remaining items were not among
those originally selected for analysis. This agreement tends to lend support to our
inference concerning the nature of Factor A.

TABLE 1. OBLIQUE FACTOR MATRIX V.

Items*                                               Factor C
Good appetite (F)                                    -10
Easily awakened by noise                             -07
Life full of interests (F)                            28
As able to work as ever (F)                          -05
At times I feel like swearing                         38
Hard to keep my mind on a task                        20
Seldom worry about my health (F)                     -12
At times I feel like smashing things                  59
Periods when I couldn’t get going                     13
Sleep fitful and disturbed                           -10
Judgment better than ever (F)                         02
Good physical health (F)
Pass by friends                                       25
Am a good mixer (F)                                   26   10
Keep at a thing until others lose patience            41
Wish I was happy as others
Lacking in self-confidence
Usually feel life is worthwhile (F)
Takes lot of argument to convince people
Don't seem to care what happens to me
Am happy most of the time
Capable and smart as most others (F)
Never vomited or coughed up blood (F)
Criticism or scolding hurts terribly
Feel useless at times
At times feel like picking a fist fight
Sleep without ideas bothering me (F)
Been well most of the time (F)
Never had a fit or convulsion (F)
Cry easily
Can’t understand what I read as well as I used to
Never felt better in my life (F)
Memory seems all right (F)
Afraid of losing my mind
Feel weak all over much of time
Sometimes when embarrassed I sweat
Do not have hay fever or asthma (F)
Enjoy play and recreation (F)
Like to flirt (F)
Have stood in the way of people
Brood a great deal
Dream frequently about things best kept to self
No more nervous than most others (F)
Sometimes without reason feel excitedly happy
Sweat very easily
At times am full of energy (F)
Troubled by nausea and vomiting
Work under a great deal of tension
Have periods of unusual cheerfulness

*See MMPI for complete statement of items.

¹The centroid matrix and transformation matrix are available from J. P. O’Connor, Catholic
University, Washington 17, D. C.

Factor B seems to represent a cycloid parameter characterized by moodiness 
and general excitability or agitation. Typical items involve crying easily, having 
periods in which one can’t get going, feeling excitedly happy or cheerful without any 
reason or even when things are going wrong, and finding it hard to concentrate. 
Statements reflecting insomnia, brooding and somatic complaints are also correlated 
with this factor. The cycloid factor of Mosier“* and the moodiness factor of Lay- 
man“®) were both isolated from questionnaire data and very closely approximate 
Factor B of this study. They both include mood swings, low spirits and feeling miser- 
able as their most significant items. Factor B also appears related to the factor of 
melancholy agitation of Lorr, Jenkins, and O’Connor“) derived from ratings of 
observed behavior. 


Almost all of the significant items of Factor C are of unit complexity and they 
seem to characterize it as primarily one of hostility or belligerence. High loadings on 
statements expressing desires to smash things, pick a fist fight, and swear clearly 
delineate a rage-like reaction. Consistent with the clinical notion of hostility are 
such items as not seeming to care what happens to one’s self, keeping at a thing until 
others lose patience, and having had a fit or convulsion. The only items with load- 
ings greater than .35 which seem somewhat foreign to the general hostility pattern 
are ones in which liking to flirt and never having vomited or coughed up blood are 
denied. Factor C, based on self-descriptive data, bears a close resemblance to the 
belligerence or hostility factors isolated in previous studies“! ) from observations 
of clinical behavior. 


Inspection of the items which load significantly on Factor D leaves little doubt 
that it is the same factor of inferiority or sense of personal inadequacy isolated by 
O’Connor, Lorr and Stafford“® in their recent analysis of the Taylor Anxiety Scale. 
The inferiority factor has also been identified in several previous studies“ ™ ™), 
Factor D is associated with feelings of uselessness, lack of self-confidence, extreme 
sensitivity to criticism, avoidance of others, a wish to be as happy as others, and 
difficulties in reading comprehension. 

Factor E is descriptive of a shy, self-effacing, melancholic desurgency or de- 
pression. It is somewhat similar to the negative pole of Cattell’s“ factor F (surgency 
vs. desurgency), and to Mosier’s“* factor of depression. The significant items on 
Factor E include those in which one denies the following: that he is full of energy, 
feels life is worthwhile, is as capable and smart as most others, likes to flirt and is a 
good mixer. It is also denied that one’s judgment is better than it ever was and that 
one has periods in which he feels unusually cheerful without any special reason. 

The correlations between the primary factors (Table 2) were factored by the
complete centroid method and rotated orthogonally until simple structure was
approximated. Three second-order factors (Table 3) were isolated, but only one,
Factor Y, seems fairly well defined. The cycloid reaction (B) and depression (E) have
significant loadings of unit complexity and suggest that Y is similar to the depressive
pole of Degan’s®? second order factor which he labels paranoid-depressive. This end
of his bipolar factor is defined by his first order factors of neurasthenia and
depression.

TABLE 2. CORRELATIONS BETWEEN PRIMARY FACTORS

TABLE 3. SECOND ORDER FACTOR MATRIX

Factor X with high loadings on hypochondriasis (A) and inferiority (D) vaguely 
suggests a general malaise involving both physical and psychological discomfort. 
Factor Z is too ill-defined for even the most tentative interpretation. 

The results of this study indicate rather clearly that the MMPI D scale is a 
complex measure consisting of at least five separate parameters. While these factors 
are consistent with the common clinical picture of depression they cannot be said to 
“add up” to a depressive syndrome. Since the dimensions underlying the scale are 
relatively independent, a simple additive D score is just as ambiguous as the total 
score on a test of general intelligence. In both cases the pattern of differential re- 
actions is lost by summating. It is believed that clinicians who use the MMPI would 
obtain more meaningful behavior descriptions if they considered the pattern of item 
clusters within each scale. This procedure would not be essentially different from the 
present practice of MMPI profile analysis using scale scores. The results of the 
present analysis could provide the foundation for intra-scale analysis of D, but similar 
studies of other MMPI scales are needed before this method can be more generally 
applicable. 


SUMMARY 


The items of the MMPI depression scale were intercorrelated ($r_t$) over a population
of 300 white, male, veteran, neuropsychiatric hospital patients of mixed diagnoses
and the resulting matrix was factored and rotated obliquely to simple structure.
The five factors which were isolated were identified as hypochondriasis, cycloid
tendency, hostility, inferiority and depression. Three second-order factors were
extracted, but only one, depression, could be identified with any degree of certainty.
The results indicate the multidimensionality of the D scale and call into question the
practice of attributing to the scale a simple unitary significance.


THE ABSTRACT THINKING ABILITIES OF MENTAL PATIENTS¹,²

PROBLEM

Psychological tests purporting to measure abstract thinking play an important 
role in the armamentarium of the clinical psychologist “: §- ". 2). Since many differ- 
ent kinds of tests are used from which inferences are made about patients’ abstract 
thinking abilities, it is of interest to determine the relationship among these alleged 
measures on a patient sample by means of factor analysis. Conceivably this might
serve as a preliminary step toward evaluating the validity of certain commonly held 
clinical assumptions and help to provide a systematic framework for test rationale 
within which clinicians can function more accurately and effectively: “ *%. The 
major purpose of this study is to search for common factors in a wide variety of 
clinical psychological measures which are assumed to bear on the abstract thinking 
abilities of psychiatric patients. 

In order to investigate this problem, it was necessary to select certain psycho- 
logical tests which are used to measure the ability to abstract or its loss. The major 
criterion for selection in the test battery was that the tests employed should throw 
light on the abstract thinking processes. It was considered advantageous for the 
battery to include both verbal and performance tests so as to encompass a greater 
variety of functions. Also it was desired that tests amenable to quantification should 
be employed as well as tests enjoying popular clinical usage. 


TESTS

The final battery of tests included the Wechsler-Bellevue Intelligence Scale
Form I subtests of Vocabulary, Similarities and Block Design “?, an inverted Digit
Symbol test, the Rorschach, Color-Form“, Supraordinate-Subordinate @*, the
Similarity-Difference Association test ®, an Oscillating Figures test, the Geometric
Cross-Out test °*), and the Mirror Tracing test. In all an eleven-test battery was
used from which estimates of thirty-two variables were obtained. Fourteen of the
variables came from the Rorschach test (variables 1 through 14), four from the
Similarity-Difference Association test⁴ (variables 16, 17, 18, 19), three from the
Wechsler-Bellevue Intelligence Scale Form I (variables 26, 27, 28), three from the
Color-Form Sorting test (variables 15, 29, 30), two from the Geometric Cross-Out
test (variables 20, 21), two from the inverted Digit Symbol test (variables 22, 23),
two from the Mirror Tracing test (variables 24, 25), one from the Oscillating Figures
test (variable 31) and one from the Supraordinate-Subordinate test (variable 32).


POPULATION 


Fifty white, male hospital patients were randomly selected as subjects from the 
Acute Intensive Treatment section of the Veterans Administration Neuro-psychia- 
tric Hospital at North Little Rock, Arkansas. The mean age of the sample at the 
time of testing was 31.1 years with a standard deviation of 8.4. The mean number of
years of education was 9.6 which is equivalent to slightly more than a year and a
half of high school.


¹This paper was presented at the 1956 annual meeting of the American Psychological Association
in Chicago.

²The data for this study were collected while the senior author was a trainee in the Veterans
Administration Training Program for Clinical Psychologists. This a oy is a major revision and
extension of the senior author’s doctoral dissertation. The authors would like to express their
appreciation to Dr. Henry N. Peters for providing the —— impetus for this study and to Dr. Albert Kubala
for his advice and assistance in the mechanics of factor analysis.

³Now at the 1100th USAF Hospital, Bolling AFB, Washington, D. C.

⁴This test was analyzed and scored on the basis of the conceptual schema described by Rapaport “#)
for distinguishing among the concrete, functional and conceptual or abstract levels of response.

The sample had ten different psychiatric diagnoses. The complete breakdown
was as follows: of the fifteen schizophrenics tested, nine were diagnosed schizophrenic 
reaction unclassified; three paranoid; two catatonic; and one hebephrenic. Of the 
fifteen neurotic patients, ten were diagnosed anxiety reaction, two conversion re- 
action, and three psychoneurotic. Of the five alcoholics, three were called acute and 
two chronic. Of the five epileptic patients, three were diagnosed grand mal, one 
idiopathic, and one unclassified. In addition to these, there were two diagnoses of 
encephalopathy, one of which was described as traumatic, and the other as post- 
traumatic. Also included in the sample were five patients diagnosed psychotic re- 
action. Of these, one was unclassified; one was called depressive reaction; one was 
described as having brain disease; another, as psychopathic; and the fifth as an 
inadequate personality. The remaining three patients bore respective diagnoses of 
tertiary syphilis, cephalalgia, and manic-depressive reaction. 


METHOD 


Through preliminary testing it was found that the battery was too long to be 
given in one session. In order to keep motivation at a maximum and fatigue at a 
minimum, it was decided to split the battery into two testing sessions with the 
Vocabulary, Similarities, Block Design, Digit Symbol, Color-Form, Supraordinate- 
Subordinate and the Rorschach test administered in the first session, and the Similar- 
ity-Difference, Oscillating Figures, Geometric Cross-Out and Mirror Tracing tests 
given in the second session. Ordinarily both sessions were completed within two 
days. In no case was the interval greater than three days. All testing was individual- 
ly administered by the senior author in the order described. 

Scores having more than two digits were transformed linearly into a range of 
0 to 99 and all scores on Mirror Tracing and Digit Symbol were inverted so that a 
high score would correspond to a good performance. The thirty-two test variables
were then intercorrelated using appropriate Pearson, biserial and tetrachoric
correlation coefficients.⁵ These results are presented in Table 1. Variables one through
fourteen in Table 1 are all Rorschach measures scored according to Beck“), except 
for F+% which was scored according to Klopfer“*’. For our purposes all of the 
Rorschach color responses were combined into one category, since only four pure C 
responses occurred and almost all of the others were CF responses. Variables five, 
six and seven were dichotomized into presence or absence of the respective measure. 
Variables fifteen, twenty-nine and thirty from the Color-Form test were dichoto- 
mized into success or failure in making the required sortings, whether the patient 
sorted initially by form or color, and finally, whether or not patterning was observed 
to have occurred. 
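
The score transformation described at the start of this paragraph can be sketched as follows. The paper does not state the exact mapping, so this sketch assumes a simple min-max rescaling onto 0 to 99 with rounding, and it inverts by subtracting from 99; the function name and the sample times are hypothetical.

```python
def rescale_0_99(scores, invert=False):
    """Linearly map scores onto the range 0..99 (assumed min-max rescaling).

    With invert=True the scale is reversed, so that low raw values
    (e.g. short times or few errors) become high transformed scores.
    """
    lo, hi = min(scores), max(scores)
    if hi == lo:                       # no spread: avoid division by zero
        return [0 for _ in scores]
    scaled = [round(99 * (s - lo) / (hi - lo)) for s in scores]
    if invert:
        scaled = [99 - v for v in scaled]
    return scaled

# Hypothetical Mirror Tracing times in seconds (lower is better).
times = [120, 245, 310, 180]
transformed = rescale_0_99(times, invert=True)
print(transformed)
```

A linear rescaling of this kind changes only the units of a variable, so it leaves Pearson correlations unchanged in size, and the inversion merely reverses their sign.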

After inspection of the correlations, the matrix was reduced to twenty-four 
variables because certain measures had only few significant correlations with the 
rest of the variables, while others had spuriously high correlations. For example, 
the three Rorschach reaction time variables each had only one significant correlation 
with the remaining twenty-nine variables in the test battery; therefore, only average 
reaction time was retained. The number of similarities on the Similarity Difference 
Association Test had only one significant correlation which is easily a chance oc- 
currence. The Supraordinate-Subordinate Test also yielded only one significant 
correlation and, accordingly, it was omitted from our matrix. Moreover, a practical 
consideration entered into our decision to reduce the size of the matrix, since our 
machine methods for a centroid analysis are capable of handling a maximum of 24 
variables. The factor analysis was then done, using the centroid method of Thur- 
stone"). As guides for when to stop factoring, criteria suggested by Burt and 
Banks® and Guilford and Lacey ® were used. After six factors had been extracted 
an estimated 104% of the common factor variance had been accounted for and the 
sixth factor residuals ranged in size from −.262 to .234. Because of the relatively
large residuals, a seventh factor was extracted which is not reported, since it did not
meet either of our two criteria for a significant factor.

⁵Tetrachoric r’s were computed for some of our variables since the short range of many of the
Rorschach scoring determinants made the Pearson product-moment correlation coefficients undesirable.

Rotation to simple structure was then attempted without regard to positive 
manifold. As a guide in rotation, the goal was to maximize the number of variables 
with zero projections on the reference vectors. The factors were rotated blindly two 
at a time, using graphs. Six orthogonal factors were extracted, five of which were bi- 
polar. The new factor loadings and their communalities after rotation of axes are 
presented in Table 2. 


TABLE 2. ROTATED FACTOR LOADINGS

Variable
W%                               8352
D%                               6399
C                                1.0002
M                                7005   7216
A%                               5191
F%                               6419
R                                4827
P                                6415
Average Reaction Time            8057
Success on Color-Form
Abstract Approach                6658
Concrete Approach                3073
Geometric X-Out (Expls.)         4770
Digit Symbol (Error)             3186
Digit Symbol (Time)              5791
Mirror Tracing (PNT)
Mirror Tracing (Time)
Similarities
Block Design
Vocabulary
Sorting By Form
Absence of Patterning
Oscillating Figures

Decimals are omitted.

*Communalities of the centroid factor loadings.
IDENTIFICATION OF FACTORS 

Eight of the twenty-four variables have significant loadings on the first factor.® 
These are Abstract Approach (.70), Success on Color-Form (.68), Vocabulary (.64), 
Similarities (.62), Absence of Patterning (.59), Block Design (.55), Digit Symbol 
Time (.49), and Concrete Approach (.41). This is the clearest factor in the group 
and it was identified as abstract ability or general intelligence. The three marker 
variables from the Wechsler-Bellevue Intelligence Scale are heavily loaded on this 
factor as expected. 

The six variables with significant loadings on the second factor are W% (-.88), 
D% (.79), M (.65), P (.62), F+% (.56) and Digit Symbol Time (.46). It appears to 
be a complex factor which is almost entirely specific to the Rorschach test. This 
factor is difficult to subsume under one label since it is defined primarily by W% and 
D% which apparently represent a gross perceptual approach to the Rorschach 
cards. Moreover, three of the remaining four defining variables on this factor are 
functionally related to the approach variables in the Rorschach scoring scheme which 
further complicates its interpretation. This factor appears to involve perceptual 
accuracy or control as well as perceptual approach. Perhaps these two components 
could be subsumed under a label of perceptual organization. 

The third factor is entirely specific to the Rorschach test. The four variables 
with significant loadings are C (.95), R (.51), A% (-.60), and F% (-.50). The presence 
of color responses on the Rorschach test appears to be almost a factorially pure 
measure of this factor which we have identified as free responsiveness to the environment. 
The remaining variables which define this factor must be interpreted cautiously 
since they may be only instrumental artifacts of the color variable. 

®Because of the magnitude of the standard error of tetrachoric coefficients and the small N in the 
present sample, only factor loadings of .40 and above will be considered significant. 

Factors IV, V and VI appear to be mainly residual factors of little theoretical 
interest. Only variables 22 and 10 have moderate loadings on the fourth factor. In 
none of these three residual factors does a pattern emerge clearly enough to permit 
any generalizations.’ 


DISCUSSION 

The high loading of Success on Color-Form on our first factor is surprising, since 
Goldstein and Scheerer “*’ regard this test largely as a measure of the ability to adopt 
the abstract attitude, and failure on this task is interpreted as evidence of rigidity 
and extreme concreteness. This kind of interpretation is reinforced whenever 
patterning is observed to occur. Yet our variable Absence of Patterning also has its 
highest loading on our intelligence factor which suggests that the rationale offered for 
this test by its leading adherents needs to be changed so that the major role of intelli- 
gence or abstract ability is recognized and emphasized. In this connection, our 
analysis revealed no evidence of a rigidity-flexibility factor which had loomed large 
in our preliminary conceptualization of the variables. Our experience here is similar 
to that of Cattell“) who concluded that “the confident clinical conceptualization of 
rigidity as a unitary quality of the individual to which the present writer once sub- 
scribed must be abandoned, except as a very narrow factor in a set of very similar 
and largely motor tests.” 

The significant loading on our first factor of abstract ability of the Concrete 
Approach variable merits special comment since this finding is in direct opposition 
to much of current clinical psychological practice and theory “: '* *2). Moreover, our 
results here are similar to those reported by Stacey and Spanier®°’, who found that 
contrary to expectation, in their group of superior intelligence, the functional de- 
finitions on Wechsler-Bellevue I Vocabulary were actually associated more with 
lower intelligence than the descriptive type or concrete definitions. This finding was 
in general agreement with two earlier studies “*: '*) using subnormals, which suggest- 
ed that the descriptive type definition on Wechsler-Bellevue I Similarities is actually 
of a slightly higher level than the functional definition. These findings are in agree- 
ment with those reported by Chodorkoff and Mussen“?, who compared the vocabu- 
lary responses of a group of normals and schizophrenics and found that the latter 
selected a significantly greater number of function and example types of definition, 
that the normals chose significantly more class definitions and that descriptive (con- 
crete) definitions were chosen with about equal frequency by both groups. They 
also report significant negative correlations between the Shipley-Hartford Con- 
ceptual Quotient and the number of function definitions selected for both groups. 
These findings may have important repercussions on our concepts of the nature of 
intelligence. Specifically they suggest that we might obtain more accurate estimates 
of intelligence if the scoring schemes for certain of the Wechsler sub-tests were chang- 
ed so that a functional type response on Similarities and Vocabulary were to receive 
no credit at all. From a theoretical viewpoint, it appears that the abstract-func- 
tional-concrete dimension would be more congruent with reality in the form: abstract- 
concrete-functional. 

Our second and third factors, which we identified as perceptual organization 
and free responsiveness to the environment respectively, show little obvious re- 
semblance to any of the factors identified by either Wittenborn @’’ or Williams and 
Lawrence 2). Our results do, however, support Wittenborn’s conclusion @’ that 
the factorial composition of the human movement response is distinctly different 
from the factorial composition of the color form response. Our findings do not 
corroborate Williams and Lawrence’s conclusion ° that “the appearance of an 
‘intelligence’ factor supports the belief that certain Rorschach determinants covary with 
intelligence.” In our study none of the Rorschach variables had significant loadings 
on our intelligence factor, although one variable, namely P, approached significance. 

~ These three factors each have more than half their loadings below two standard errors“); hence, 
the significance of these factors should be suspect. All of these factors do, however, meet the criterion 
for significance suggested by Guilford and Lacey. 

In a recent paper, Bieri®) reported that college subjects in the M > Sum C 
group generally had significantly larger reaction times to the blots than did subjects 
in the Sum C > M group. Our data provide an additional opportunity to determine 
if there is a difference in the behavior of subjects who show a predominance of one or 
the other of the two response modes, since our sample included some patients who 
gave at least one M response, but no color, as well as some who produced a minimum 
of one color response but no M. The comparison between the seven patients in the 
former group, with the nineteen patients in the latter group, appears relevant to 
Bieri’s hypothesis. None of the three t tests between the two groups for the three 
Rorschach reaction time measures approached significance. Nor were there any sig- 
nificant differences between the two in age, education or intelligence. These findings 
suggest that the relationship found by Bieri in college subjects apparently does not 
hold in a psychiatric patient sample. 

The two measures from the Mirror Tracing test deserve special mention since a 
significant relationship has been posited between the abstract attitude and visual 
motor performance. Peters“*) hypothesized that subjects varied in the abstractness 
of their attitude in their approach to the Mirror Tracing test, which he believed ap- 
peared to be especially adapted to a concrete approach. Our results do not support 
his hypothesis since the Abstract Approach variable is significantly positively cor- 
related with the time score on Mirror Tracing, and the Functional Approach is sig- 
nificantly negatively correlated with this variable. For the part not traced on Mirror 
Tracing, the Concrete Approach is significantly positively related to good perform- 
ance as hypothesized by Peters, but the Functional Approach instead of the Ab- 
stract Approach is significantly negatively related to this measure. These findings 
suggest that Peters’ hypothesis is an oversimplification of the actual relationships 
which exist among these variables. 

Of particular interest also are the relations between Rorschach approach type 
and the abstract, functional and concrete approach scores from the Similarity- 
Difference Association test, since some clinicians tend to equate W with an abstract 
approach“: 3. '7) and D with a concrete, practical approach". Neither of these 
assumptions obtains any support from our data, which suggests instead, that there 
may be a relationship between dd% and a concrete approach. 

Finally, it is of interest to compare our findings on the Wechsler subtests and 
their Rorschach correlates with those reported by Holzberg and Belmont“. On 
Similarities and Block Design they predicted significant positive relationships with 
M and W% and significant negative correlations with D%. None of their reported 
correlations reached the 5% level of significance. Our results are very similar to 
theirs inasmuch as none of our correlations among these variables reached signi- 
ficance either. The question arises, however, as to the basis on which their predic- 
tions were made, since the present writers would agree with their predictions only 
with regard to M. This in turn raises the question as to exactly how typical are 
their predictions as compared with those of clinicians in general. No information 
on this issue is provided in their paper which might also clarify why they found a 
fairly large number of relationships that were not predicted. 


SUMMARY AND CONCLUSIONS 


An exploratory factor study in the general area of abstract thinking was under- 
taken for the purpose of searching for common factors in a variety of measures be- 
lieved to be of value in evaluating patients’ abstract thinking abilities or impairment 
of such abilities. A battery of individual tests was selected which included the 
Wechsler-Bellevue Intelligence Scale Form I subtests of Vocabulary, Similarities and 
Block Design. Also included were an inverted Digit Symbol Test, the Color-Form 
Test, a Supraordinate-Subordinate Test, the Rorschach, a Similarity-Difference 
Association Test, an Oscillating Figures Test, the Geometric Cross-Out Test, and 
the Mirror Tracing Test. In all an eleven-test battery was used from which estimates 
of thirty-two variables were obtained. The test scores were correlated, and after 
inspection, the correlation matrix was reduced to twenty-four variables. This 
matrix was then factor analyzed, using Thurstone’s centroid method. Six orthogonal 
factors were identified as a result of the analysis, only three of which were inter- 
preted. The first factor was identified as abstract ability or general intelligence; the 
second, as a factor of perceptual organization; and the third, as free responsiveness 
to the environment. Especially noteworthy was the finding that the variable Con- 
crete Approach had a significant loading on our intelligence factor, and that our 
analysis failed to reveal the presence of a rigidity-flexibility factor in our battery of 
abstract thinking tests. Because of the limitations of the present study, i.e., the 
small sample, and the particular selection of test variables especially from the Ror- 
schach, it is believed that further studies in this area are required in order to deter- 
mine whether the factors identified in this research can be replicated. 
FACTOR ANALYSIS OF A PARTICULAR ASPECT OF 
BEHAVIORAL CONTROL: IMPULSIVITY¹ 


DAVID C. TWAIN² 
The Pennsylvania State University 


PROBLEM 

This study is concerned, in general, with behavioral control and particularly 
with behavior commonly referred to as “impulsive.” The object of this study is the 
clarification of the concept “impulsivity,” with specific emphasis upon a delineation 
of the kinds of behavior which might underlie behavior regarded as impulsive. To 
describe the impulsive person as one given to sudden, imprudent and predominately 
affective action agrees essentially with an authoritative psychological dictionary “ 
and with other published psychological and psychiatric definitions. 

Thurstone “*® has published a scale designed to measure “impulsivity,” derived 
through a factor analysis of existing scales. Thurstone describes the impulsive person 
as “happy-go-lucky,” “daredevil,” and “carefree.” The impulsive person acts on 
the “spur-of-the-moment,” enjoys competition, and changes easily from one task to 
another. Goldman-Eisler®: »- 77?) developed a self-rating schedule for impulsivity. 
She describes a person given to “impulsion” as “tending to act quickly and without 
reflection; as making intuitive or emotional decisions and displaying an inability to 
inhibit an impulse.” 

Though this writer has been unable to find any studies which aim directly at the 
investigation of impulsivity, there has been a good deal of writing done in the general 
area. The psychoanalytically oriented writings of Schilder“*) and Rapaport ®: ' on 
thought processes contain discussions concerning the basis of behavioral control, 
concurring for the most part with those of Freud“: * in his formulation of the ‘‘pri- 
mary” and “secondary” processes. Of pertinence, in relation to impulsivity, are 
studies dealing with the development, within the individual, of the capacity to delay 
reaction. Other authors introduce alternative terms for “impulsivity” which are not 
clearly differentiated from that term. Rapaport‘ does not seem to make any dis- 
tinction between the concepts “impulsivity” and “lability”. Rorschach literature 
abounds with references to impulsive behavior and how it is measured by the Ror- 
schach, although little more than a brief description can be found when a compre- 
hensive definition of impulsivity is sought. 

Redl and Wineman“"? in their discussion of the ‘‘control system’’ describe how 
behavior, which would characteristically be imputed to be impulsive, might stem 
from different behavioral reactions. In general, the literature indicates that im- 
pulsive behavior is regarded as a more or less unitary sort of behavior which is similar 
in all those instances in which it appears. It is probable, however, that impulsive 
behavior is composed of several distinctive behavioral characteristics. Our hypothesis 
is that tests which measure aspects of behavioral control characteristically representative 
of “impulsivity,” as currently defined, will, upon statistical analysis, reveal 
the operation of more than one factor underlying such behavior. This assumes that a 
sound estimate of the various aspects of behavioral control can be obtained in a 
group testing situation. 


PROCEDURE 

The subjects for this study were 142 women enrolled in summer session classes 
at the Pennsylvania State University. 

¹This report is a summary of a dissertation submitted to the Pennsylvania State University in 
1955. The author expresses his appreciation to Drs. Norman Tallent, Robert Bernreuter, and William 
Ray for their discussion and guidance. 

²Now at Neuropsychiatric Clinic, Fort Belvoir, Virginia. 

The instruments consisted of sixteen tests selected to correspond to aspects of 
behavioral control. Tests 1, 8-12, 15 and 16 are comprehensively described in 
French’s®) monograph. Tests 2 through 7 were standardized by Goldman-Eisler 
on 115 adult subjects. Reliabilities were obtained by the split-half method and the 
Spearman-Brown prophecy formula. The reliability figures for tests 2 through 7, 
respectively, are: .71, .72, .70, .90, .78 and .88. Thornton * first administered tests 
11 and 12 to 189 beginning college students. Reliabilities were obtained by the 
split-half method and the Spearman-Brown prophecy formula. The reliabilities of tests 11 
and 12, respectively, are: .946 and .951. 
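The split-half reliabilities quoted above rest on the Spearman-Brown step-up from a half-test correlation to a full-length estimate. A minimal sketch of that step follows; the function name and the half-test correlation of .60 are hypothetical and are not taken from the studies cited.

```python
def spearman_brown(r_half, length_factor=2.0):
    """Step up a part-test correlation to the reliability of a lengthened test.

    With length_factor = 2 this is the usual split-half correction.
    """
    return length_factor * r_half / (1.0 + (length_factor - 1.0) * r_half)


# Hypothetical correlation between the two halves, for illustration only.
r_half = 0.60
r_full = spearman_brown(r_half)
```

The corrected figure is never lower than the half-test correlation for positive values, which is why reported split-half reliabilities tend to look higher than the raw correlation between halves.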

Reliability figures for the other measures utilized in the present study could not 
be obtained. But as Thornton‘) reports, the reliabilities can be estimated. The 
reliability of a test is composed of its specific variance plus its common variance. 
The h² column in Table 1 contains the estimates of the common variance of the tests. 
The reliability of a test can be estimated from the h² column, keeping in mind the 
fact that the reliability will always be greater than h² by an amount representative 
of the specific variance of the test. 
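The argument can be written out as follows. This is a sketch under the assumption of orthogonal centroid factors, so that the communality of test j is the sum of its squared loadings; the symbols r_tt (reliability), s² (specific variance), e² (error variance), and a_jk (loading of test j on factor k) are notation introduced here for illustration.

```latex
\begin{align*}
1 &= h_j^2 + s_j^2 + e_j^2, \\
h_j^2 &= \sum_{k=1}^{m} a_{jk}^2, \\
r_{tt} &= h_j^2 + s_j^2 = 1 - e_j^2 \;\ge\; h_j^2 .
\end{align*}
```

On this reading the h² column gives a lower bound for each test's reliability rather than the reliability itself.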


THE TESTS 


Test 1. Speed. This is test number 7 from an analysis by Davidson and Carroll). 
Test 2. Change. Self-rating scale from Goldman-Eisler “), 

Test 3. Exocathexis (action orientation). Self-rating scale from Goldman-Eisler. “ 
Test 4. Sociability. Self-rating scale from Goldman-Eisler. 

Test 5. Optimism. Self-rating scale from Goldman-Eisler. “ 

Test 6. Aggression. Self-rating scale from Goldman-Eisler. “) 

Test 7. Autonomy. Self-rating scale from Goldman-Eisler. 


Test 8. Ideational Fluency. This is test number one from an analysis by Johnson and Reynolds“, 


Test 9. Height. The height of the tested individual. 
Test 10. Weight. The weight of the tested individual. 


Test 11. Withstanding Discomfort. This test was adopted from studies by Rethlingshafer“?) and 
Thornton“). It is test 9 and test 1 in these studies, respectively. 


Test 12. Motor Inhibition. This is test 8 in Rethlingshafer’s“*) study, test 3 in Thornton’s*). 
Test 13. Attitude toward the Germans. This is Thurstone’s“® self-rating scale. 

Test 14. Attitude toward the Chinese. This is Thurstone’s“”) self-rating scale. 

Test 15. Perseveration. This is test 2 in Rethlingshafer’s“) study, test 6 in Thornton’s“*). 
Test 16. Persistence. This is test 3 in Rethlingshafer’s“*) study, test 7 in Thornton’s *). 


The testing was done in groups varying in size from 4 to 20 persons. Subjects 
were assured the testing was confidential. Test booklets were numbered, no names 
were required. Standardized instructions were given to each group. The sum of 
squares and cross products were secured through I. B. M. punch cards. Pearsonian 
correlations were then obtained. 

The statistical technique employed in this study is factor analysis which should 
not be attempted unless there is evidence of significant relationship among the var- 
iables. Such evidence is demonstrated, in the present study, in the matrix of inter- 
correlations of which approximately one sixth are significant at the five percent level. 
The correlation matrix was factor analyzed by the complete centroid method of 
Thurstone“*). Six factors were extracted; at this point inspection of the residuals re- 
vealed them to be inconsiderable. 
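To give a sense of scale for the statement about significant intercorrelations, the sketch below counts the coefficients among sixteen tests and approximates the size of r needed for significance with 142 subjects. The critical value uses the Fisher z approximation with a normal deviate of 1.96, which is an assumption made here and not necessarily the table the author consulted.

```python
from math import sqrt, tanh

n_tests = 16
n_subjects = 142

# Number of distinct intercorrelations among 16 tests.
n_pairs = n_tests * (n_tests - 1) // 2

# Approximate two-tailed .05 critical value of r via the Fisher z
# transformation (an approximation, not necessarily the author's table).
critical_r = tanh(1.96 / sqrt(n_subjects - 3))

# "Approximately one sixth" of the coefficients.
about_one_sixth = round(n_pairs / 6)

# Number expected to reach the .05 level by chance alone.
expected_by_chance = 0.05 * n_pairs
```

Under these assumptions roughly twenty coefficients reached significance where about six would be expected by chance, which is the sense in which the matrix shows evidence of relationship among the variables.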

Rotation of the oblique reference axis to simple structure followed Thurstone’s 
“5) method utilizing two-dimensional sections. Seven rotations were required. An 
inspection of the final plots revealed no further need for rotation and they were 
accepted as the essential configuration for purposes of interpretation. 


RESULTS 
The factor analysis of the matrix of intercorrelations of the characteristic 
aspects of impulsivity resulted in the determination of six factors. The centroid 
matrix Fy) is presented in Table 1. The communalities shown in column h? of Table 
1 tend to be low. This finding indicates that the tests employed in this study measure 
aspects of behavior which tend to be independent of one another, in agreement with 
previous studies in which most of the tests in the battery had been found to be measures of 
fairly independent behavioral characteristics. 

The arbitrary orthogonal reference axes were rotated to simple structure. The 
oblique factor matrix Vx, is presented in Table 2. The factors are listed in Table 3. 

TABLE 3. SIGNIFICANT ROTATED LOADINGS 

Factor   Description                 Traits and Test Numbers 

I        Flexible Motor Control      Motor Inhibition (12) 
                                     Withstanding Discomfort (11) 
II       Physical Status             Weight (10) 
                                     Height (9) 
III      Positive Progressiveness    Action Orientation (3) 
                                     Sociability (4) 
                                     Optimism (5) 
                                     Aggression (6) 
                                     Attitude toward Germans (13) 
IV       Tenacious Self-Control      Persistence (16) 
                                     Perseveration (15) 
V        Aggressive Instability      Autonomy (7) 
                                     Change (2) 
                                     Aggression (6) 
                                     Attitude toward Chinese (14) 
VI       Unidentified                Speed (1) 
                                     Attitude toward Chinese (14) 
                                     Ideational Fluency (8) 

IDENTIFICATION OF FACTORS 


Factor I, flexible motor control, indicates that good control over the motor abilities 
involved in tracing a line very slowly is associated with the ability to withstand 
the discomfort of a protracted period of holding the breath. Also represented here is 
an element of freedom from conflict, or flexibility. In an “impulsive outburst,” therefore, 
a rather independent factor might be the erratic motor behavior displayed. This 
factor lends itself to the term “lability,” referring to the motor reaction aspect of the 
term. 

Factor II, physical status, appears to be dependent on physical development solely. 

Factor III, positive progressiveness, seems to be concerned with the tendencies 
toward a positive type of orientation and a progressive attitude. One thinks of the 
descriptions of impulsive behavior which utilize such phrases as “happy-go-lucky,” 
“enjoying competition,” and “action-oriented.” 

Factor IV, tenacious self-control, appears to be involved with self-control of a 
“holding-in,” conforming nature. Its extreme lack is associated with impulsivity. 
Phrases that seem apt in this regard are: “unable to delay reactions” and “uncontrollable.” 

Factor V, aggressive instability, has loadings depicting forcefulness, a negative 
orientation, irascibility, and the strong desire for change. In contrast to the 
“happy-go-lucky” description, impulsive behavior is sometimes described as “aggressive,” 
“autonomous,” and very “negative.” 

Factor VI did not suggest any clear-cut interpretation. 


SUMMARY 


The object of this study was the clarification of the concept “impulsivity”. The 
literature indicates that “impulsivity” is generally regarded as a unitary sort of be- 
havior, similar in all those instances in which it appears. Our hypothesis was that tests 
which measure aspects of behavioral control representative of “impulsivity” upon 
statistical analysis will reveal the operation of more than one factor underlying such 
behavior. Estimates of behavioral control were gathered in group testing situations 
and factor analyzed. The analysis resulted in the determination of six factors, five of which 
were titled: flexible motor control, physical status, positive progressiveness, tenacious 
self-control, and aggressive instability. Factor VI was not interpreted. The hypothesis 
of this study was confirmed in that the factor analysis revealed the operation of more 
than one factor underlying the variables under study. 
THE ALTITUDE QUOTIENT AS A MEASUREMENT OF 
INTELLECTUAL POTENTIAL 


MORONI H. BROWN AND G. ELIZABETH BRYAN 
University of Utah 


INTRODUCTION 


Jastak proposed the use of Altitude Scores as an aid in test interpretation 
represented either by the single highest score in a test battery, ® or derived from the 
three highest test scores in the battery.) Certainly Jastak’s desire to obtain a more 
objective altitude measurement is commendable, but without altitude norms it has 
no more meaning than interpretations that are admittedly more subjective. His 
statement that true feeblemindedness is indicated only if the subject fails to surpass 
the lowest two or three percentiles on any or all of the tests in a battery is similar to 
the oft found statement in clinical reports to the effect that, ‘‘Although the IQ score 
would place the subject within the mentally defective range, a higher potential is 
indicated by the wide scatter.’’ Neither statement is based upon appropriate norma- 
tive data and both seem to assume that high deviation on one or two or three tests 
from one’s mean score is an uncommon occurrence and is pretty much limited to 
clinical cases. 

Although the information that Wechsler gives on the mean weighted scores and 
standard deviations of each test in his Scale at different ages is helpful in evaluating 
intertest scatter, it offers little assistance in the interpretation of altitude scores. As 
the IQ derives its meaning from IQ norms, so also must the altitude score be interpreted 
in terms of altitude norms. 


PROBLEM 


The present study assumes that in most W-B test records the subtest scores will 
vary and that the highest scores are the most valid indicators of intellectual potential. 
Therefore, the Altitude Quotient (AQ) will always be higher than the Intelligence 
Quotient (IQ). 

The purpose of the study was twofold: (1) to investigate the magnitude of the 
differences between the AQ and the IQ (AQ-IQ difference) and the nature of the distribution 
of AQ-IQ differences at three age levels and for three IQ classifications; and 
(2) to investigate the relationship between the distribution of IQ’s and the distribution 
of AQ’s. 


PROCEDURE 


The sample was selected from a large number of non-clinic W-B records. These 
records were separated into three age groups and each age group was further separated 
into three IQ classifications. Then each Age x IQ group was separated into male 
and female subgroups. From each of these 18 Age x IQ x Sex groups, the records of 
15 subjects were randomly selected (15 male and 15 female in each Age x IQ group), 
giving a total of 270 records in the entire sample. The age groups were: 10 to 15 years, 
16 to 19 years, and 20 to 29 years. The IQ classifications followed Wechsler’s grouping 
so that the Low IQ group included all Full Scale IQ’s below 91, the Medium IQ group 
included those between 91 and 110, and the High IQ group included those above 110. 

The IQ’s were based upon the weighted scores of all eleven tests in the Scale. 
The AQ’s were based upon the IQ equivalents of Full Scale weighted scores prorated 
from the average value of the two highest scores in each individual record. The decision 
to use the average values of the top two scores was based upon two considerations: 
first, that chance might be an important factor in producing the highest single 
score, and second, that the concept of altitude is progressively weakened as more 
scores are included in the computation. After computing the AQ and IQ for each 
record the difference between these two measures was obtained. 
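The prorating step can be sketched as below. This is an illustration under an assumption made here: that prorating means multiplying the mean of the two highest weighted scores by the number of tests in the Scale. The conversion of a Full Scale weighted score to its IQ equivalent depends on Wechsler's published tables, which are not reproduced, so the sketch stops at the weighted-score level. The function name and the record of scores are hypothetical.

```python
def prorated_full_scale(weighted_scores, n_top=2):
    """Prorate a Full Scale weighted score from the mean of the highest subtests.

    Assumes prorating means: mean of the n_top highest weighted scores
    multiplied by the number of subtests in the record.
    """
    top = sorted(weighted_scores, reverse=True)[:n_top]
    return len(weighted_scores) * sum(top) / n_top


# Hypothetical weighted scores on the eleven subtests (illustration only).
record = [9, 11, 8, 12, 10, 14, 9, 10, 13, 11, 10]
full_scale = sum(record)                      # basis of the IQ
altitude_scale = prorated_full_scale(record)  # basis of the AQ
```

Because the mean of the two highest scores cannot be lower than the mean of all eleven, the prorated total can never fall below the actual total, which is why the AQ is never lower than the IQ under this scheme.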
The breakdown according to sex which was used during the sampling procedure 
was not maintained because it was apparent that in each of the Age x IQ groups, the 
magnitude of the AQ-IQ differences was about the same for both sexes. 

In order to determine whether or not the magnitude of the AQ-IQ differences 
was affected by age or intellectual functioning level, the AQ-IQ differences for the 
nine Age x IQ groups were subjected to analysis of variance ®). A Pearson product 
moment correlation coefficient °) was used to determine the relationship between the 
AQ’s and the IQ’s. In order to approximate the IQ distribution in Wechsler’s stand- 
ardization sample, all of the subjects in the medium (91-110) IQ range were used for 
this purpose but only alternate subjects in the low and high IQ groups were used. 
Thus, 50% of the subjects were in the medium, or so-called normal IQ range, while 
25% had IQ’s below 91 and 25% had IQ’s above 110. 


RESULTS 
The IQ’s ranged from 44 to 138 and the AQ’s ranged from 62 to 168.¹ The mean 
of the AQ-IQ differences for the entire sample was 24.6 with a standard deviation of 
8.1. The means of the AQ-IQ differences for the nine Age x IQ groups ranged from 
18.2 to 28.1 and are given in Table 1. 


TABLE 1. MEANS OF THE AQ-IQ DIFFERENCES FOR AGE AND IQ GROUPS (N = 270) 
IQ above 110 27.1 7.3 ’ 18.2 5.5 

These findings indicate that even the most conservative altitude interpretations 
should allow for expected differences of from 16 to 33 points between AQ’s and IQ’s. 
To the extent that the AQ approximates the general intellectual potential and the 
IQ approximates the level of general intellectual functioning, the magnitude of the 
differences between these two quotients may be said to represent the degree of in- 
dividual intellectual efficiency. A very small difference would indicate that an in- 
dividual is working close to his capacity while a very large difference would indicate 
an inefficient use of abilities. 
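The range of expected differences quoted above appears to correspond to about one standard deviation on either side of the overall mean. That reading is an inference from the reported figures, not something the authors state, and the short check below only shows that the arithmetic is consistent with it.

```python
import math

mean_difference = 24.6   # mean AQ-IQ difference reported for the entire sample
sd_difference = 8.1      # its reported standard deviation

lower = mean_difference - sd_difference
upper = mean_difference + sd_difference

# Widest whole-point range containing mean plus or minus one SD.
whole_point_range = (math.floor(lower), math.ceil(upper))
```

If the inference holds, a sizable minority of non-clinic records would still be expected to fall outside this band, so a difference beyond it is unusual but not by itself exceptional.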


TABLE 2. ANALYSIS OF VARIANCE OF THE DIFFERENCES BETWEEN THE 
AQ AND THE IQ 

Source           df          SS          MS         F 
Age               2      1,135.76     567.88      9.65** 
IQ                2        487.82     243.91      4.14* 
Age x IQ          4        503.55     125.89      2.14 
Within Cells    261     15,363.67      58.86 

Total           269     17,490.80 

**Significant at the .001 level 
*Significant at the .05 level 


¹The IQ’s of the 180 subjects used in the correlation ranged from 63 to 137 and the AQ’s ranged 
from 84 to 168. 
Table 2 summarizes the analysis of variance of the AQ-IQ differences. An F 
significant at the .001 level was obtained for AQ-IQ differences associated with age 
and an F significant at the .05 level was obtained for AQ-IQ differences associated 
with the intellectual functioning level. 
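The F ratios can be recomputed from the sums of squares printed in Table 2. The sketch below assumes the degrees of freedom implied by the design (three age groups, three IQ groups, nine cells, 270 records); the resulting error mean square can be compared with the 58.86 printed beneath Tables 3 and 4.

```python
# Sums of squares as printed in Table 2.
ss = {"Age": 1135.76, "IQ": 487.82, "Age x IQ": 503.55, "Within Cells": 15363.67}

# Degrees of freedom implied by the design: 3 age groups, 3 IQ groups,
# 9 cells, 270 records (an assumption drawn from the Procedure section).
df = {"Age": 3 - 1, "IQ": 3 - 1, "Age x IQ": (3 - 1) * (3 - 1),
      "Within Cells": 270 - 9}

mean_squares = {source: ss[source] / df[source] for source in ss}
error_ms = mean_squares["Within Cells"]

# Each effect is tested against the within-cells mean square.
f_ratios = {source: mean_squares[source] / error_ms
            for source in ("Age", "IQ", "Age x IQ")}

total_ss = sum(ss.values())
```

The four component sums of squares add up to the printed total, which is a useful check that the figures were read correctly from the table.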

Table 3 shows that the largest mean AQ-IQ differences occur in the youngest 
age group and that this age group is significantly different from both of the other 
two age groups. 


TABLE 3. MEAN DIFFERENCES BETWEEN THREE AGE GROUPS, 
ALL IQ’S COMBINED (N = 90 IN EACH GROUP) 

Age Groups       M         10-15 yrs.    16-19 yrs. 

10-15 yrs.       ? 
16-19 yrs.       23.?      3.537** 
20-29 yrs.       22.?      4.782**       1.245 

Error Mean Square: 58.86         Critical d at .05 level: 2.242 
**Significant at .01 level       Critical d at .01 level: 2.951 


Table 4 shows that the mean AQ-IQ differences between the low and high IQ 
groups were significant at the .01 level while the mean differences between the med- 
ium and high IQ groups were significant at the .05 level. The mean differences be- 
tween the low and medium IQ groups were not significant. 


TABLE 4. MEAN DIFFERENCES BETWEEN THREE IQ GROUPS, 
ALL AGES COMBINED (N = 90 IN EACH GROUP) 

IQ Groups        M         IQ below 91   IQ 91-110 

IQ below 91      ? 
IQ 91-110        25.5      .696 
IQ above 110     ?         3.178**       2.482* 

Error Mean Square: 58.86         Critical d at .05 level: 2.242 
**Significant at .01 level       Critical d at .01 level: 2.951 
*Significant at .05 level 
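The critical differences printed beneath Tables 3 and 4 are consistent with the standard error of a difference between two group means multiplied by a normal deviate. The sketch below assumes deviates of 1.96 and 2.58 for the .05 and .01 levels; the authors do not state which values they used, so this is a reconstruction that happens to agree with the printed figures.

```python
from math import sqrt

error_mean_square = 58.86   # printed beneath Tables 3 and 4
n_per_group = 90            # subjects in each age group or IQ group

# Standard error of a difference between two group means.
se_difference = sqrt(2 * error_mean_square / n_per_group)

# Assumed normal deviates for the two-tailed .05 and .01 levels.
critical_d_05 = 1.96 * se_difference
critical_d_01 = 2.58 * se_difference
```

A difference between two group means is then judged significant when it exceeds the corresponding critical d, as with the entries of 3.537 and 4.782 in Table 3.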


According to the analysis, the greatest AQ-IQ differences are found in very
young subjects and in less intelligent subjects, while the smallest differences are
found in older adolescents and young adults and in subjects of superior intellect.
The variations in AQ-IQ differences are likely an artifact of the limited range of
difficulty of the tests in the W-B Scale rather than actual differences between the
levels of intellectual potential and intellectual functioning of the individuals in these
groups. The observation that the W-B Scale has neither enough “bottom” for ten
and eleven year olds and mental defectives nor enough “top” for intellectually
superior adults has received some mention in the literature. Were it not for the
“top” and “bottom” inadequacies of the W-B Scale, it would seem that the magnitude
of the AQ-IQ differences would be relatively constant for most subjects,
irrespective of age level or intellectual functioning level.

The correlation coefficient between the IQ’s and the AQ’s was .87, and the
confidence limits at the 1% level, using “z” transformations, were .82 and .91.
According to the coefficient of determination, 76% of the variance in the AQ’s is
associated with the variance in the IQ’s. This figure is high enough to indicate that the
W-B IQ is not such a poor indicator of intellectual potential after all; that is, the
AQ of a subject is closely associated with his IQ. It would seem that if the W-B Scale
had enough “top” and “bottom” so that the large AQ-IQ differences in the youngest
age group and the very small AQ-IQ differences in the high IQ group were less
extreme, then there might be a more direct relationship between the AQ and the IQ
than is indicated by the .87 correlation. However, it is unlikely that a one-to-one
relationship would exist, and the remaining unexplained variation would justify the
use of both scores in clinical interpretations.
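
The confidence limits and the coefficient of determination quoted above can be approximated as follows. This is a minimal sketch, assuming the usual Fisher z procedure with a standard error of 1/sqrt(N - 3) and N = 180 (the number of subjects given in the footnote to the correlation); the function name is hypothetical.

```python
import math
from statistics import NormalDist


def fisher_z_interval(r: float, n: int, confidence: float = 0.99) -> tuple[float, float]:
    """Confidence limits for a correlation via the Fisher z transformation."""
    z = math.atanh(r)                        # r -> z
    standard_error = 1.0 / math.sqrt(n - 3)  # standard error of z
    deviate = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    lower_z = z - deviate * standard_error
    upper_z = z + deviate * standard_error
    return math.tanh(lower_z), math.tanh(upper_z)  # z -> r


lower, upper = fisher_z_interval(0.87, 180)
shared_variance = 0.87 ** 2  # coefficient of determination
print(round(lower, 2), round(upper, 2), round(shared_variance, 2))
```

With these inputs the limits come out close to the reported .82 and .91; a small discrepancy in the lower limit would be expected if the published r was rounded to two places before printing. The squared correlation gives the 76% figure.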

Some clinical psychologists might object to the cold mechanical method of using
the top two scores in computing Altitude Quotients because the two highest scores
may be obtained on tests which in themselves are not usually thought of as “good”
measures of intelligence. However, Table 5 shows that no test in the Scale was
always, or even usually, one of the top two tests from which the AQ’s were obtained.


TABLE 5. RANK ORDER OF THE FREQUENCY OF USE OF EACH TEST IN THE COMPUTATION OF
ALTITUDE QUOTIENTS

[Table body not recoverable. Surviving row labels: IQ 91-110 (N = 90); IQ above 110 (N = 90); Age: 20-29 (N = 90); Entire Sample]

Knowing that in about two-thirds of the population the AQ may be expected to
be 16 to 33 points higher than the IQ, and also that 76% of the variance in the AQ is
associated with the obtained IQ, the clinical psychologist may find differences between
these two quotients in W-B records diagnostically and prognostically meaningful. This would
be particularly true in cases where these differences were either very large or very
small. It should be emphasized that these findings were obtained on the W-B Scale and
that caution should be used in making inferences to the WISC and WAIS. It is
unfortunate that altitude norms based upon the standardization samples of the three
Wechsler Scales are not available. However, the results of the present study may be
useful since they were obtained on a sample similar in IQ distribution to the W-B
standardization sample.

In addition to the objectivity of altitude interpretations which are based on 
normative data, the AQ would seem to be as reliable and stable as the IQ. Although 
a reliability coefficient was not obtained on this sample, Diller and Beechley made
a test-retest study of AQ’s with the Stanford-Binet and found that “the altitude age
derived from the single highest success or the three highest successes, when converted
into a quotient by dividing by the CA, appears to yield as constant and as stable a
measure as the IQ. This stability appears to be independent of the size of the
differences between the AQ and the IQ.”


SUMMARY 


The two purposes of this study were: (1) to investigate the magnitude of the 
differences between the AQ and the IQ and the nature of the distribution of AQ-IQ 
differences at three age levels and for three IQ classifications; and (2) to investigate 
the relationship between the distribution of IQ’s and the distribution of AQ’s. On the 
basis of an analysis of 270 W-B records, the mean difference between the IQ (level
of functioning) and the AQ (level of potential) was 24.6, with a standard deviation
of 8.1. This indicates that in about two-thirds of the general population AQ-IQ
differences ranging from 16 to 33 points may be expected. The correlation between
the distribution of IQ’s and AQ’s was .87, with a confidence interval between .82 and
.91. According to the coefficient of determination, about 76% of the variance in the
AQ is associated with the obtained IQ.
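
The “two-thirds” statement corresponds to the band of one standard deviation on either side of the mean difference. The sketch below assumes, as the summary implicitly does, that AQ-IQ differences are roughly normally distributed; the variable names are illustrative.

```python
from statistics import NormalDist

mean_difference = 24.6  # mean AQ-IQ difference reported in the summary
sd_difference = 8.1     # standard deviation reported in the summary

lower = mean_difference - sd_difference
upper = mean_difference + sd_difference

# Share of a normal distribution lying within one SD of its mean.
share_within_one_sd = NormalDist().cdf(1.0) - NormalDist().cdf(-1.0)

print(round(lower, 1), round(upper, 1), round(share_within_one_sd, 3))  # 16.5 32.7 0.683
```

Rounded to whole points, the band of 16.5 to 32.7 is the “16 to 33 points” quoted in the text.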


REACTIONS OF MEN UNDER STRESS TO A PICTURE
PROJECTIVE TEST¹


VICTOR B. CLINE ROBERT EGBERT® 
EDWARD FORGY? TOR MEELAND 


The Human Research Unit No. 2, CONARC 
Fort Ord, California 


PROBLEM AND METHOD 


The purpose of this study was to determine the reliability and validity of a new 
picture projection test when used to investigate differences between men who had 
been identified by other methods as outstanding fighters or nonfighters in actual 
combat situations. 

Near the close of the Korean War, a research team from the Human Research 
Unit No. 2 at Fort Ord, California, visited a great number of companies in three 
divisions on the front line that had recently engaged in active combat. Three hun- 
dred and fifty men were selected as being either outstanding fighters or extreme ex- 
amples of nonfighters. Care was taken to document every case from several sources 
and exclude any questionable subjects. Thus the fighter group was composed of men 
who had functioned exceptionally well under stress. The nonfighters tended, in the 
main, to be men who had collapsed and/or panicked under the strain and had shown
this by various overt acts. 

Three hundred and ten of these men were later pulled back to a rear area and in 
small groups of 12 to 18 given intensive psychological assessments lasting one week. 
Two groups were processed weekly. They took 87 tests and assessment procedures 
ranging from such things as the MMPI and a clinical life-history interview to
physiological tests, values measures, etc. These tests yielded 417 scores for each subject
assessed.

One of the measures administered was a modified version of the TAT using six
black and white freehand drawn pictures⁴ developed especially for this study.
Because of the considerable size of the sample tested and the limited number of
research personnel, it was decided to develop a group form of this picture projective
test which might retain the clinical complexity of the technique as a diagnostic
instrument and lead, as well, to quantified ratings. In a TAT type of analysis such as
this, the primary aim was not to study the personality
dynamics of a particular individual, but rather to compare two groups of men, the
outstanding fighters and the nonfighters, and thereby to learn something
of their similarities and differences. This meant, necessarily, that one would have
to tally “category scores,” obtain means for each group on a number of
dimensions, and calculate the significance of differences.

The first three pictures suggested non-military themes centering around
relationships to parental surrogates, physical energy level, homosexuality, personal
violence and trauma. The last three dealt with a military environment. Picture one
shows “father,” “mother,” and “older son” figures. The father has his arm around
the son, who is turning away. The mother is looking off into the distance. Picture
two shows a group of semi-nude men lounging in a room. Picture three shows a man
with extensive body and head injuries being supported by two other men with a
crowd of bystanders looking on. Picture four shows a man in a military uniform
walking down a deserted street. Picture five depicts three figures, obviously soldiers, in
mountainous terrain. Only partially shown are the feet and legs of another man lying
on the ground. Picture six is a very ambiguous figure most often referred to as “an
explosion.”


¹The research reported here was conducted by the authors while they were employed by The
George Washington University, Human Resources Research Office, operating under contract with the
Department of the Army. Opinions and conclusions are those of the authors and should not be
construed as representing those of the Department of the Army. Appreciation is expressed to David King,
Charles Brown, and Martin Spickler who participated and assisted in this project.

²Now at the University of Oregon, Eugene, Oregon.

³Now at the Brigham Young University, Provo, Utah.

⁴Pictures 1, 2, and 3 were [illegible] by Dr. V. B. Cline, Joseph Ataide. Pictures 4, 5, and 6 were
used through the courtesy of Dr. and Mrs. Rodney Clark.

Groups of 12-18 men were administered the picture projective test at one time.
Every man was given photographic reproductions of each picture and a booklet in 
which to write his interpretations, or stories. Instructions were patterned along those 
given with the original Murray TAT pictures. Subjects had five minutes for each 
picture. 

On the basis of a review of the TAT literature and pilot studies by the
researchers, a multiple choice scoring system was devised using the six pictures
described above, with 86 primary categories and 335 sub-items. In this way an attempt
was made to rate the stories on all dimensions on which they appeared to vary. The
system chosen necessarily borrowed from many others and, because of its
“objectivity,” has run the risk of being too molecular, with resultant loss of scope and
content. Four⁵ psychologists scored the same 100 protocols (chosen randomly from
the larger sample of 310) using this standard system. Five-choice IBM answer
sheets were used by the raters for scoring. This allowed machine item counts and
quick, accurate computation of reliability (or agreement) between scorers as well as
the significance of fighter-nonfighter differences on the various item categories. Each
psychologist read a story in a particular protocol and checked the most appropriate
alternative for each item. For example:


24. Outcome of story (Picture 3)
____ (a) Favorable ending due to hero’s action
____ (b) Unfavorable ending due to hero’s action
____ (c) Favorable ending due to outside agent’s action
____ (d) Unfavorable ending due to outside agent’s action
____ (e) No ending or resolution

Behavior of Hero (Picture 3)
____ (a) Hero has feelings of guilt, shame, or remorse
____ (b) Hero blames others for his trouble
____ (c) Hero feels worthy, respectable, guiltless
____ (d) None of above apply


Unique items were constructed for each picture after examining a large number 
of responses (stories) in a previous pilot study. These covered such areas and themes 
as: resolution of conflict, endings, handling of anxiety, hostility, authority, guilt, 
conformity, sexuality, coherency, as well as common themes projected into each 
particular picture. Other items related to an overall evaluation of the subject’s six
stories, such as “number of outcomes favorable to hero,” productivity, and “mode
of expression.” Protocols from a sample of 100 subjects were analyzed by four
psychologists. Fifty of these protocols belonged to fighters and 50 to nonfighters.
A coding system was used to conceal from the raters the fighter-nonfighter
classification of each protocol.


RESULTS 


The intelligence level of the troops in this study was found to be somewhat below
the general army average (86 vs 100, respectively, on Army Aptitude Area I). While
very few men were illiterate (11 of 310), many were not highly verbal, and their
stories were often terse and brief. Responses to the pictures averaged
about a paragraph in length. This suggested that in any future work with men of
lower intelligence it would probably be more productive if the tester himself
recorded the subject’s spoken responses to each picture, rather than requiring him to
write them in longhand.


⁵One rater had to leave the project after having scored about half of his protocols. Another
(fourth) man was brought in, taught the procedure, and finished the ratings.


Reliability. The first question investigated was whether the psychologists were in
substantial agreement in their ratings or scoring. Inter-rater correlations
(tetrachoric) were computed on a sample of items for paired raters, psychologists A vs. B
and B vs. C. The median correlation was .72, with a range of .10 to .95. The items
were selected on the basis of their splits (always between 20% and 80% rated) and
their variety of content. Thus, there was a moderate degree of congruence or
reliability among the researchers in their scoring of the same protocols. It might be
mentioned that the objective types of content material, such as “presence or absence”
type ratings (e.g., shows anger), were most reliably rated, compared to qualitative
discriminations where raters could interpret both the variable and the
projective material in different and ambiguous ways.
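
To make the idea of a tetrachoric inter-rater correlation concrete, the sketch below uses the classical cosine-pi approximation on a 2 x 2 agreement table. This is an illustration only: the article does not say how its tetrachorics were computed, the approximation is substituted here for whatever method the authors used, and the cell counts are hypothetical.

```python
import math


def tetrachoric_cos_pi(a: int, b: int, c: int, d: int) -> float:
    """Approximate tetrachoric r from a 2x2 table [[a, b], [c, d]].

    a and d count items on which the two raters agree (both mark, both omit);
    b and c count disagreements. Uses the cosine-pi approximation.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("the approximation needs all four cells to be positive")
    odds_ratio = (a * d) / (b * c)
    return math.cos(math.pi / (1.0 + math.sqrt(odds_ratio)))


# Hypothetical counts for one item scored by two raters on 100 protocols.
r_item = tetrachoric_cos_pi(40, 10, 10, 40)
print(round(r_item, 3))  # 0.809
```

The approximation is most trustworthy when the item splits are not extreme, which is consistent with the authors' restriction to items with splits between 20% and 80%.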


TABLE 1. PICTURE PROJECTIVE TEST: INTER-RATER CORRELATIONS¹


                 Rater A vs Rater B      Rater B vs Rater C
r                No. of Correlations     No. of Correlations

.91 - .99
.81 - .90
.71 - .80
.61 - .70
.51 - .60
.41 - .50
.31 - .40
.21 - .30
.11 - .20
.01 - .10
[Frequency counts illegible in source]

Median r         .73                     .70


¹Tetrachoric r’s were computed for selected test items, Rater A vs Rater B and
B vs C. An N of one hundred protocols was used. Items were selected on the
basis of < 80-20 splits, minimizing content repetition.


Discrimination between Fighters and Nonfighters. The significance of differences
between ratings (on the protocols of the 50 fighters and 50 nonfighters) was computed
for each of psychologist A’s ratings (such as “Hero manifests considerable anxiety,”
in Picture I). The same was done for the other psychologist raters. Psychologist A
rated 20 items (out of 335) on the 50 fighters and 50 nonfighters which differentiated
between the two groups at the .05 level of significance or better. Psychologist B had
15 significant differences and psychologists C and D had only 13. These are near the
chance level of expected differences. Moreover, the agreement between raters as to
which items differentiated fighters and nonfighters seemed to be quite low, with one
rater’s most discriminating items often showing no difference for the other raters.
Then the judgments of all of the psychologists were combined for the fighters and
nonfighters on every item separately. The significances of the differences were again
computed. This process makes significant differences less likely than the stated
confidence level would indicate. Significant differentiations of the raters combined
would tend to cancel out somewhat because of imperfect correlations between raters.
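
The remark that 13 to 20 significant items are “near the chance level” can be illustrated with a simple count. The sketch below treats the 335 items as independent tests at the .05 level, which is an assumption for illustration only (the items are certainly correlated to some degree); the function name is hypothetical.

```python
import math


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X distributed as Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_items = 335  # sub-items in the scoring system
alpha = 0.05   # significance level used for each item

expected_by_chance = n_items * alpha
p_twenty_or_more = prob_at_least(20, n_items, alpha)

print(round(expected_by_chance, 2), round(p_twenty_or_more, 2))
```

About 17 items would be expected to reach the .05 level by chance alone, so even Psychologist A's 20 significant items is not an unusual result under this simplified model.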

Using these combined (or averaged) ratings of the psychologists, four differences
appeared significant at the .05 level of confidence. Two of these related to the
perception of “Conflict between the son and parent” in Picture I (by the fighters
significantly more often than by the nonfighters). This apparently contradicted the
life history and interview findings, which showed less tension and conflict reported
between fighters and their parents than for the nonfighters. The third difference
related to the item labeled “coherency, intelligibility and communicability of story,”
in which fighters received higher ratings. This is probably a reflection of their greater
intelligence as indicated by other tests.

Finally, the prediction of which protocols belonged to which criterion group
proved significant, i.e., raters successfully predicted whether a protocol belonged to a
fighter or to a nonfighter. This judgment, of course, was based only upon global
analysis of the projective material written by each subject.

Four further differences emerged, although only at the .10 level. In picture
three, showing the injured man being assisted by two other men, two themes
predominated for the fighters: “violence is done to the hero who is supported by friends” and
“hero is hurt, but it is an accident, and is nobody’s fault.” The nonfighters’ responses
were “significantly” more often unclear, fragmentary or unusual. The quality of the
grammar also was “significantly” different between the two groups, the grammar
of the fighters being judged superior.


DISCUSSION


This TAT procedure was one of 86 different tests and procedures given to each 
subject during each week’s assessment. The majority of the tests yielded a profusion 
of significant differences between fighters and nonfighters. The MMPI, for example, 
showed 32 scales out of 47, or about two-thirds discriminating significantly, and many 
of these far beyond the .01 level of confidence. However, one must realize many of 
these are highly intercorrelated. Yet on a test such as the C. L. Humor test where 
factors are minimally correlated, 8 out of 10 factors differentiated at the .05 level. 

The great body of supplementary data, including intensive life-history inter- 
views, yielded considerable knowledge about the personality dynamics and structure 
of the fighter as compared to the nonfighter. In the light of this information
(reported elsewhere) the TAT-like projective test used in this study added little to our
understanding and in one case was at variance with the more objective or direct type 
material gathered through personal interviews. That the TAT type of technique 
proved of limited value, here, of course, need indicate little about its value if it were 
used in another setting in which the traditional pictures and scoring methods are 
used. And it may well be that, if each protocol had been analyzed in a more wholistic
fashion, and the scoring categories made somewhat more gross, greater knowledge and
insight concerning these men might have been achieved.


SUMMARY 


Near the close of the Korean War 310 fighters and nonfighters were given a 
week’s assessment. This involved administering 86 separate tests and procedures 
one of which was a TAT-like picture projective test. Using a special scoring system, 
four psychologists independently analyzed 100 test protocols. Fair rater agreement 
was obtained, with the median interrater correlation being .72. However, differences
between fighters and nonfighters were only at the chance level. This was in sharp 
contrast to such test instruments as the MMPI, Humor Test, clinical life history 
interview, etc., where a plethora of differences emerged. 


A CRITERION MEASURE OF WITHIN-HOSPITAL CHANGE IN
PSYCHIATRIC ILLNESS¹


M. H. GORDON, S. B. LINDLEY AND R. B. MAY
Veterans Administration Hospital, Knoxville, Iowa


INTRODUCTION 


This report concerns one candidate for membership in a multiple criterion of 
within-hospital change in psychiatric illness. A method is proposed, and tested for 
objectivity and stability, for tracking in quantitative terms the progress of a patient 
from one location in the hospital to another. This movement is assumed to reflect one 
aspect of his change in psychiatric illness. 

Evaluating with appreciable confidence changes in psychiatric illness among
so-called “chronic” psychiatric patients has been a persistently vexing problem (?, pp.
334-338; 4, pp. 1-29). Release from the hospital has been considered a basis for classifying
patients in the preferred division of a dichotomous improvement criterion. Relatively
very small proportions of chronic patients leave the hospital, however; many have
been hospitalized five, ten, even twenty and more years. This criterion fails,
therefore, to subdivide its less preferred class into a number of intervals commensurate
with the relative size of its population. This criterion misses the less dramatic, less
obvious, more gradual changes reported among patients who do not leave the
hospital.

Some chronic patients seem to get better, and some worse, in different degrees.
Psychological tests would be useful in detecting this variability, except that
they fail to elicit scorable responses from many chronic patients. Behavior rating
scales can carry but a share of the burden and, like psychological tests, are relatively
costly in man-hours. A criterion seems to be needed consisting of several or many
variables, each quantified, based on, and acceptably relevant to, some external
referent of getting better-getting worse, each satisfactorily objective and stable, and,
optimally weighted in combination, resulting in a measure both highly reliable and
highly valid (?, pp. 1261-1252; 3, pp. 1-2).


PROCEDURE 


Each of the hospital’s “n-1” wards was indicated on a small white card, about
two inches square; “m” judges were asked independently to rank the cards with
respect to the “mental status of the average patient on the ward.” An additional,
highest rank was assigned to “trial visit or better,” i.e., leaving the hospital, it was
hoped, not to return.

The judges were hospital staff members, representing a variety of viewpoints 
regarding patient care: physicians, nurses, psychologists, social workers, attendants, 
rehabilitation therapists. Each was concerned, furthermore, with some form of 
direct patient care and responsible for knowing the hospital’s patient population as a 
whole. The judges sometimes expressed dissatisfaction with the given basis for 
ranking the wards. Each judge was requested, however, to decide according to his 
own viewpoint. 

This procedure was followed in early October 1953 with m = 16 and n = 35; 
five months later in February 1954 with the same people ranking the same wards; a 
third time in July 1956 but with added judges, m = 24, and an additional ward, 
n = 36. 

The ranks for each year were converted by Guilford’s method to “scale values
assuming a composite standard” (?, pp. 186-188, 193), then to a mean of 50 and standard
deviation of 10. “Morbidity” scores on the Lorr Multidimensional Scale for Rating
Psychiatric Patients, Hospital Form, were available for 905 patients who had been
rated in the summer of 1956 and were, at that time, residing on 25 of the 35 ranked wards.
The relevance of the 1956 ward ranking to these scores was studied.
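
The final rescaling step, to a mean of 50 and a standard deviation of 10, is a simple linear transformation. The sketch below shows only that step; it does not attempt Guilford's composite-standard method, the input values are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which form was used.

```python
from statistics import mean, pstdev


def to_mean_50_sd_10(values: list[float]) -> list[float]:
    """Linearly rescale values so that their mean is 50 and their SD is 10."""
    m = mean(values)
    s = pstdev(values)  # population SD; an assumption, see the note above
    return [50.0 + 10.0 * (v - m) / s for v in values]


# Hypothetical scale values for five ward locations.
hypothetical_scale_values = [-1.2, -0.4, 0.0, 0.3, 1.3]
rescaled = to_mean_50_sd_10(hypothetical_scale_values)
print([round(v, 1) for v in rescaled])
```

Because the transformation is linear, it preserves the rank order and relative spacing of the locations and changes only the unit and origin of the scale.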


¹From the Veterans Administration Hospital, Knoxville, Iowa. Earlier results were reported in
September 1954 to the Behavioral Analysis section of the joint meetings of the Psychometric Society 
and the American Psychological Association, Division Five. 


RESULTS AND DISCUSSION


Distribution Characteristics of the Scaled Values. The fit of each year’s distribution of
scaled values to the normal curve was tested by Fisher’s “method of cumulants”
(?, pp. 153-155). Symmetry, g1, was, for 1953, .674; 1954, .679; 1956, .654; 1956 less the
ward omitted in previous years, .650. These amounts were accepted as insignificant
at about the .10 confidence level. Kurtosis, g2, was, for 1953, -.168; 1954, .123; 1956,
.0704; 1956 less the ward omitted in previous years, .00353. These amounts were
accepted as insignificant at greater than the .80 confidence level.

It is inferred that, although these distributions follow quite closely the kurtosis 
of the normal curve, they tend to be positively skewed. This asymmetry probably 
reflects, first, the total consensus assumed for “trial visit or better,” and second, the
generally chronic condition of the patient population. The latter might be expected
to lead to more consensus, hence more spread, where wards are judged “better than,”
in contrast to “as poor as” or “worse than,” the composite standard.


Reliability. Inter-judge agreement for each year was measured by average rank
intercorrelations, computed from concordances (?, pp. 172-176; 11, pp. 278-286, 439). These
correlations were: for 1953, .783; 1954, .756; 1956, .760; 1956 less the ward omitted in
previous years, .762. All measures were significant beyond the .001 confidence level.
It is inferred that the scale is based on judgments having acceptable “objectivity,”
in the sense of public agreement.
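
An average rank intercorrelation can be obtained from a coefficient of concordance without computing every pairwise correlation. The sketch below uses Kendall's W for untied ranks and the standard identity linking W to the mean Spearman correlation among m judges; the rankings are hypothetical, and whether the authors used exactly this route is an assumption.

```python
def kendalls_w(rankings: list[list[int]]) -> float:
    """Kendall's coefficient of concordance for m judges ranking n objects (no ties)."""
    m = len(rankings)
    n = len(rankings[0])
    rank_sums = [sum(judge[j] for judge in rankings) for j in range(n)]
    mean_sum = m * (n + 1) / 2.0
    s = sum((total - mean_sum) ** 2 for total in rank_sums)
    return 12.0 * s / (m**2 * (n**3 - n))


def average_rank_correlation(w: float, m: int) -> float:
    """Mean Spearman rho over all pairs of judges, recovered from W."""
    return (m * w - 1.0) / (m - 1.0)


# Hypothetical rankings of four wards by three judges.
hypothetical_rankings = [[1, 2, 3, 4], [1, 3, 2, 4], [2, 1, 3, 4]]
w = kendalls_w(hypothetical_rankings)
r_bar = average_rank_correlation(w, len(hypothetical_rankings))
print(round(w, 3), round(r_bar, 3))  # 0.778 0.667
```

Read in the other direction, the identity implies that the 1953 average intercorrelation of .783 among 16 judges corresponds to a concordance of roughly .80.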

The stability of this agreement from year to year was evaluated by partitioning
chi-square for the obtained concordance (?, pp. 181-12). Discrepance chi-square, the
sum of the three chi-squares for 1953, 1954 and 1956 less the ward omitted in
previous years, minus chi-square for the pooled data for the three years, was extremely
small and accepted as insignificant beyond the .999 confidence level. This consistency
of amount of inter-judge agreement, even when the number of judges was increased
in 1956, suggests that, unless a different basis for ranking were given to the judges,
probably a limit has been reached. Rank order correlations between scale values for
pairs of years ranged from .970 to .993; product-moment correlations, from .967 to
.984. It is inferred that the scale is acceptably “stable” (?, pp. 1242-1243).


Validity. The relationship for 25 wards between their 1956 location scale values and
1956 Lorr scale morbidity scores was investigated by utilizing the morbidity values 
at the three quartiles of each ward’s distribution. This utilization seemed necessary 
because the morbidity score distributions tended to vary in shape from ward to ward. 
If the distributions did not so vary, then the correlations between location scale 
values and morbidity scores would be expected to be the same at each quartile. 

The three rank order correlations between location scale and morbidity score 
were -.714 at the first quartile, -.747 at the median, and -.732 at the third quartile.
Since these correlations were about the same, it was inferred that the distributions 
were similar from ward to ward. Further evidence for this similarity appeared in the 
rank order correlations between morbidity quartile scores; these ranged from .83 to 
.93, all significant beyond the .03 confidence level. On the other hand, these correla- 
tions were not perfect; therefore, it still seemed necessary to examine the location 
scale-morbidity score relationship at each of the three quartiles. 

Since the correlations between location scale and morbidity score were signi- 
ficant at beyond the .05 confidence level, it is inferred that, when the wards are 
ranked as described in this study, they tend to fall in substantially the same prefer- 
ential order as when ranked on the basis of the detailed behavioral ratings of the Lorr 
scale. Perfect agreement between the two methods of ranking was missed by less 
than 14 per cent. This amount of agreement seems especially high in view of the 
following three facts: first, different patients were rated by different raters—thus, 
the reliability of the morbidity scores is not known; second, only 25 of the 35 wards 
were included in this sample; third, the Lorr scale data were collected without a 
view to research but independently as a clinical service to the hospital. 

The product moment correlation for the 905 patients between their 1956
location scale values and 1956 Lorr scale morbidity scores was -.358, significant beyond
the .01 confidence level. It is inferred that, assuming the 905 patients to represent a
sample selected at random from all such patients, the location scale acceptably
discriminates among different levels of the “morbidity” assessed by the Lorr scale. The
amount of this relationship falls far short, however, of what would be required for
individual prediction: the standard deviation of location scale values was 8.81, and the
standard error of estimate of these values from morbidity score was reduced only to
8.23; the standard deviation of morbidity scores was 14.3, and the standard error of estimate
of these scores from location scale was reduced only to 13.3.
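
The small reduction from standard deviation to standard error of estimate follows directly from the size of the correlation. The sketch below applies the usual large-sample formula, SD times the square root of (1 - r squared), to the figures given in the text; the function name is illustrative.

```python
import math


def standard_error_of_estimate(sd: float, r: float) -> float:
    """Standard error of estimate when predicting a variable with the given SD
    from another variable correlated r with it (large-sample form)."""
    return sd * math.sqrt(1.0 - r * r)


see_location = standard_error_of_estimate(8.81, -0.358)   # location from morbidity
see_morbidity = standard_error_of_estimate(14.3, -0.358)  # morbidity from location

print(round(see_location, 2), round(see_morbidity, 2))
```

The first value reproduces the reported 8.23. The second comes out slightly above the reported 13.3, a difference that rounding of the published standard deviation or correlation would plausibly explain. In both cases prediction error shrinks by less than 7 per cent.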

Correlation ratio of location scale value on morbidity score was .380; of morbid- 
ity on location, .407. Both ratios were significant beyond the .01 confidence level. 
The latter ratio differed significantly (beyond the .01 confidence level) from the 
product moment correlation, indicating that a non-linear function could be found 
which would increase the prediction of morbidity scores from location scale value. 
The size of this non-linear relationship would still, however, seem far short of re- 
quirements for individual prediction. 

Finally, the scale has been assumed to hold its own validity “by definition”
(1, p. 124). The assignment or reassignment of a patient within the hospital represents
a composite judgment, a serious, practical, medical staff decision which is based upon 
all the pertinent information available about both patient and hospital at the time of 
proposed movement. Staff members responsible for proposing and deciding upon this 
movement have often expressed ideas, not only of qualitative-like differences in the
locations (for example, “This is a ‘disturbed’ ward, but that is a ‘quiet’ one”) but of
differences in degree of psychiatric illness (for example, “The patients on this ward
are in better shape than the patients on that ward”).


The Scale Values. Different hospitals would need, of course, to have their own scales. 
The method described herein for obtaining scale values, however, seems general. 
From time to time, within a hospital, the setting may change so that rescaling based
on reordering of wards would be necessary.

The scale values and their distribution refer to locations, not patients. Thus, 
when the values are assigned to patients, statistical treatment would follow the dis- 
tribution of values assigned to the patients and the assumptions underlying the 
sampling of the patients. 

SUMMARY


A composite ranking of wards at this hospital has resulted in a device for helping 
to evaluate change in psychiatric illness within a chronic patient setting. The scale is 
evidently based on judgments having acceptable objectivity, stability, and “face” 
relevance to the local hospital setting. The scale is shown to be clearly relevant to 
what is assessed by the Lorr scale’s morbidity scores. The relatively modest level of 
this relevance, though statistically highly significant, suggests that the two scales 
might be usefully combined, each contributing different information in a multiple 
criterion. 


A STUDY OF FUNCTIONAL RELATIONSHIPS AMONG MEASURES OF
ANXIETY, EGO STRENGTH AND ADJUSTMENT 


EARL J. ENDS¹ AND CURTIS W. PAGE²
Willmar (Minnesota) State Hospital 


PROBLEM 


The present study investigates several measures of anxiety, ego strength and
adjustment on a single population of reasonable heterogeneity regarding
psychological characteristics and degree of pathology and in which both favorable and
unfavorable personality changes are known clinically to have occurred.

Specifically, we propose to examine correlations pre and post therapy among
the following measures: the Taylor Anxiety Scale (Ta); the Freeman Manifest
Anxiety Scale* (FMA); the raw psychasthenia scale (Pt) of the MMPI; the
depression scale (D) of the MMPI; the Barron Ego Strength Scale (Es), a derivative of
the MMPI; the Dymond Adjustment Scale (DA) for Q sorts; the self:ideal-self
Q sort score (SI); and the self-criterion Q sort score (SC). The latter is a
scale similar to the DA in purpose and derivation but takes account of the
weightings of the specific items. Further, we shall examine these conceptually related
measures for functional relationships as measured by the correlation of changes
(difference scores) from pre to post therapy. In other words, does concomitant
variation occur among the measures?


PROCEDURE 
The 63 subjects were male alcoholics within the age range of 25 to 45; mean age 
was 36.92 years, SD 5.88 years. All subjects achieved scores above the 40th percent-
ile on the Army General Classification Test (AGCT); the mean AGCT score was
115.10, SD 10.38. All were hospitalized for at least a 60-day period.

The MMPI, the FMA, and Q sort were administered six weeks apart, before 
and after 15 sessions of group psychotherapy. The Ta, Pt, Es, and D scores were all 
derived from the MMPI; the DA, SC and SI scores were derived from the self and 
ideal-self Q sorts. SC and SI scores were computed by transforming correlations to 
Fisher z’s and multiplying by 100.
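The scoring rule for SC and SI stated above can be sketched as follows. This is a minimal illustration, assuming the usual Fisher transformation z = arctanh(r); the function name and the example correlations are hypothetical and are not taken from the study.

```python
import math


def q_sort_score(r: float) -> float:
    """Return 100 times the Fisher z transform of a Q-sort correlation r."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return 100.0 * math.atanh(r)


# Hypothetical self vs. ideal-self correlations for three subjects.
example_rs = [0.00, 0.30, 0.60]
for r in example_rs:
    print(f"r = {r:.2f} -> score = {q_sort_score(r):.1f}")
```

The transformation stretches the scale near the extremes, so that equal score differences are more nearly comparable across the range of r.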


RESULTS AND DISCUSSION


After first examining the distributions for range and linearity, product moment 
correlations were computed for: (1) all combinations of pre scores; (2) all combina- 
tions of post scores; and (3) all combinations of difference scores. These data are 
presented in Table 1. With the exception of the FMA, all relationships are in the 
expected direction. Functional correlations among the variables are generally weak- 
er than the pre and post correlations. Some attenuation is to be expected because 
difference scores are likely to be more influenced by unreliability of the measures 
than are single scores. In some cases, the correlations are high enough to suggest 
that the variables are indeed functionally related. 
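The three kinds of correlation described above (pre, post, and difference scores) can be sketched as below, together with an approximate check of the critical values of r quoted in the note to Table 1. The score lists are hypothetical, and the critical value uses the Fisher z approximation rather than an exact t table, which is an assumption of this sketch.

```python
import math
from statistics import NormalDist, mean


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def critical_r(n, alpha):
    """Approximate two-tailed critical |r| via the Fisher z transformation."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return math.tanh(z_crit / math.sqrt(n - 3))


# Hypothetical pre and post scores for six subjects (not data from the study).
ta_pre, ta_post = [30, 25, 28, 35, 22, 27], [24, 23, 20, 30, 21, 19]
d_pre, d_post = [28, 22, 26, 31, 20, 25], [23, 20, 21, 27, 19, 20]

# Difference scores: post minus pre for each subject.
ta_change = [post - pre for pre, post in zip(ta_pre, ta_post)]
d_change = [post - pre for pre, post in zip(d_pre, d_post)]

print("pre r:       ", round(pearson_r(ta_pre, d_pre), 2))
print("post r:      ", round(pearson_r(ta_post, d_post), 2))
print("difference r:", round(pearson_r(ta_change, d_change), 2))
print("critical r (N = 63):", round(critical_r(63, 0.05), 2), round(critical_r(63, 0.01), 2))
```

For N = 63 the approximation rounds to .25 at the 5% level and .32 at the 1% level, in agreement with the table note.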

The FMA gives little evidence of a clear-cut systematic relation to any of 
the other variables except Es, in which instance a functional relationship did not 
occur. The correlations between Pt and Ta are identical to those found by other 
investigators ®: ©); their correlations with all the other variables are remarkably sim- 
ilar and suggest that they do indeed measure essentially the same thing. Anxiety as 
measured by Pt and Ta appears to bear a definite functional relationship to de- 
pression (D). Pt and Ta also bear a functional inverse relationship to SC, SI, DA, 
and Es, but Ta has the stronger relationship in the cases of SC and DA. Since the 

1Now at Dept. of Psychology, University of Denver. 
2Now Chief Clinical Psychologist, Osawatomie State Hospital, Kansas. 
3Now entitled “The Freeman Anxiety Neurosis and Psychosomatic Test.”







TABLE 1. CORRELATIONS OF PRE, POST, AND DIFFERENCE SCORES BETWEEN
EACH OF THE MEASURES
For an N of 63 to differ significantly from zero at the 5% level an r must be ± .25; at the 1%
level ± .32.


latter two measures were specifically developed to assess “adjustment” or “mental
health” and have been used with some success, it appears that the Ta has a slight
advantage over Pt in tapping those aspects of anxiety that are most likely to change 
in short term therapy. Further, anxiety as measured by Ta seems to be more closely 
associated with the measures of adjustment (SC, DA) than with either self-satis- 
faction (SI) or ego-strength (Es) though differences are only suggestive. 

The SC and DA also show considerable similarity in terms of degree of relation- 
ship to the other variables. They, of course, represent different ways of scoring the 
same set of items. The agreement between them is the highest in the entire analysis 
but was not expected for the reason that SC is a judged optimal configuration using 
nine category weights whereas the DA ignores configuration and is scored simply on 
whether given statements are placed on the “correct” side of the distribution. The 
implication of this finding is that for a given DA score there exists a high degree of 
similarity in degree of deviation of specific statements from a judged healthy pattern, 
or stated more positively, there probably exists strong similarity between configura- 
tions of Q sort statements for persons of a similar level of adjustment as measured 
by the DA. However, SC bears a much closer functional relationship (inverse) to 
D than does DA. As might be expected from Block and Thomas’ results®, both 
bear a strong functional relationship to self-satisfaction (SI), but neither seems close- 
ly related to Es. SC appears to have an advantage over DA as a measure of adjust- 
ment because it is about equally sensitive to changes in Ta and D, while DA is 
sensitive to changes in Ta but not in D. 

The Es score shows the expected relationships to all variables used, but not in a
functional sense. Only in the cases of Pt, Ta and D, are the functional relationships 
significant. If Es is measuring ego strength, one of course would not expect it to be 
closely related to personality changes associated with short term therapy. There is a 
clue in its correlation with the anxiety and depression measures that it may be 
measuring defenses against these conditions. 

Comparing the pre and post correlations, what look like systematic changes
appear in some cases. In general,
the adjustment measures (SC, DA, SI) show a lower correlation with affect or symp- 
tom measures (Ta, Pt, D) on post-therapy testing than on pre-therapy testing. The 
Es scale, on the other hand, has a higher correlation with the above measures follow-
ing therapy. The relationship between Es and the adjustment measures appears
to be a relatively constant one. Ta and Pt correlations with Es tend to be higher on 
post testing (when correlation between pre and post testing is ignored). Pt also shows 
significantly lower correlations with SC and DA on post testing. The DA - SI cor- 
relation is significantly higher on post testing. 
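One common way to compare a pre correlation with a post correlation while ignoring their dependence, as the text describes, is the Fisher z test for two independent correlations. The sketch below assumes that test; the source does not state which procedure was used, and the two r values are hypothetical.

```python
import math
from statistics import NormalDist


def compare_independent_rs(r1, r2, n1, n2):
    """Fisher z test for the difference between two correlations.

    Treats the two correlations as coming from independent samples,
    i.e. the pre-post dependence is deliberately ignored.
    Returns the z statistic and a two-tailed p value.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical pre and post correlations between two scales, N = 63 each time.
z_stat, p_value = compare_independent_rs(-0.60, -0.20, 63, 63)
print(f"z = {z_stat:.2f}, two-tailed p = {p_value:.4f}")
```

Because the same subjects supply both correlations, a test that models the dependence would generally be preferable; this version only mirrors the simplification named in the text.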


SUMMARY 


In an attempt to determine whether a dynamic functional relationship exists 
between measures which should logically bear such a relationship if they are indeed 
measuring what they purport to measure, pre, post, and difference scores were cor- 
related between each of three measures of manifest anxiety, two measures of ad- 
justment, one of ego strength, and one of self-satisfaction. These scores were de- 
rived from tests given before and after fifteen sessions of group psychotherapy with 
63 hospitalized alcoholic patients. 

Findings suggest that functional relationships between measures developed to 
assess theoretically related psychological constructs do exist between some scales 
but not between others even though scores are correlated in the usual sense. Further, 
the closeness of the functional relationship with a third may vary considerably be- 
tween two tests designed to measure the same variable, indicating that considerable 
caution must be used in inferring functional relationships on the basis of high cor- 
relations between single measures of two variables. 

There is also some evidence that the relationship between the measures of 
affect and the measures of adjustment and ego strength involved is not consistent in 
degree but suggests the possibility of a systematic change in the strength of the 
relationships which may be presumed to be a result of therapy.

RELATIONSHIP OF CORNELL MEDICAL INDEX RESPONSES
TO POSTSURGICAL INVALIDISM¹


MORTON BARD AND SHELDON E. WAXENBERG 


Sloan-Kettering Institute for Cancer Research, New York City 


PROBLEM AND METHOD 


In the course of an intensive psychological study of 20 white female patients, 
aged 28 to 58 years and without known psychiatric or surgical histories, who under- 
went radical mastectomy for carcinoma of the breast“: *’, each patient was inter- 
viewed at length and given the Cornell Medical Index Health Questionnaire “®? the 
day before surgery and again between six and ten weeks after surgery. The initial 
interview was the basis for a multiple clinician ranking of the patients on the trait of 
dependence; and the final interview, for ranking them on the trait of psychogenic 
invalidism. In a previous publication”, the correlation between the dependence 
rankings and the invalidism rankings was found to be .50 (p < .05). The CMI 
scores are here analyzed for clues indicating how this quickly answered self-admin- 
istered questionnaire could serve at the time of diagnosis of malignancies as a means 
of screening patients predisposed to psychologically determined invalidism reactions 
in order that they might receive preventive psychiatric attention before under- 
going surgery. 

The first 107 questions of the CMI have to do with different organ systems and 
will be referred to here as the Physical Symptoms section. The next 37 questions 
dealing with fatiguability, frequency of illness, miscellaneous diseases, and habits, 
will be referred to as the General Medical section. The final 51 questions, referred 
to as Mood and Feeling section, relate to inadequacy, depression, anxiety, sensitiv- 
ity, anger, and tension. Although it has been found that the number of bodily and 
psychological complaints affirmed on the CMI is correlated positively with age, 
the average numbers of such complaints for women in the age ranges 26 to 35, 36 to 
45, and 46 to 55 years are almost identical, and all of the breast cancer patients con- 
sidered here were within or very near to this over-all range of ages. 


RESULTS 


The average total number of positive responses for 19 patients (one patient 
having refused the second CMI) decreased from 20.5 preoperatively to 16.5 after 
surgery. The mean difference was not significant by a t test. The average number
of affirmative responses in the Physical Symptoms section fell from 11.4 to 8.8, in 
the General Medical section, from 3.5 to 3.3, and in the Mood and Feeling section, 
from 5.5 to 4.4. The rank correlation between the pre- and postsurgical total CMI 
scores, the number of yes answers indicating personal affliction or involvement, was 
.67 (p < .01). Only 3 patients shifted from a higher to a lower score category and 2 
from a lower to a higher score category when dichotomized on the basis of ranks 1 to 
10 being regarded as higher and ranks 11 to 19 being regarded as lower. The chi 
square for these changes was not significant. For the Physical Symptoms scores 
r = .54 (p < .05); for the General Medical scores r = .70 (p < .01); and for the 
Mood and Feeling scores r = .88 (p < .01). The CMI response patterns thus appear 
to be fairly consistent for this patient group despite the intervening surgery. 
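A rank correlation of the kind reported above can be sketched as follows. This is an illustration only: it assumes Spearman's coefficient computed as the product-moment correlation of average ranks (the source says only "rank correlation"), and the scores are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson r computed on average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical pre- and postsurgical total scores for six patients.
pre_scores = [12, 30, 8, 21, 17, 25]
post_scores = [10, 22, 9, 24, 11, 18]
print("rho =", round(spearman_rho(pre_scores, post_scores), 2))
```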

In the study of these patients, dependence was defined as a basic personality 
component expressed in the solicitation of support and protection from those re- 
garded as powerful and nurturant and by the extent to which a patient has been 
unable to free herself from maternal control. The preoperative total CMI scores and 
the three subscores, after ranking, did not correlate significantly with the rankings of
the patients on the dependence measures, which were derived from judgments of 
the preoperative interview material by several clinicians. Nor did the postoperative 
CMI score rankings, total or subsection, correlate significantly with the dependence
ranks. 

However, preoperative total CMI score ranks correlated .49 (p < .05) and the 
postoperative total CMI scores correlated .52 (p < .05) with the postsurgical in- 
validism ranks which were based on evidence of depression, weakness, and low levels 
of social participation and motor activity in postoperative interviews. In both CMI 
administrations, the Physical Symptoms subsections correlated highest with the 
invalidism rankings: preoperatively, r = .57 (p < .01) and postoperatively, r = .60 
(p < .01). The General Medical subsection correlations with invalidism rankings 
were .44 (p < .05) and .42 (p < .05). The Mood and Feeling subsection did not 
correlate significantly either time. 

Use of the preoperative total CMI scores to predict the ten least and the ten 
most invalided patients would have resulted in 12, or 60%, correct classifications 
out of the 20. Use of the Physical Symptom subscores alone would have given 14, or 
70%, correctly classified. The General Medical and the Mood and Feeling subscores 
would have given 11 and 10 correct, respectively. These percentages are not sig- 
nificantly different from those which might be yielded by blind selections. 
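The statement that these hit rates do not differ from blind selection can be illustrated with a simple binomial tail probability. This is a simplified sketch: it treats each of the 20 classifications as an independent chance event with probability .5, which is an assumption (a ten-and-ten split constrains the outcomes), and it is not necessarily the test the authors applied.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Correct classifications reported in the text, each out of 20 patients.
for label, correct in [("total CMI", 12), ("Physical Symptoms", 14),
                       ("General Medical", 11), ("Mood and Feeling", 10)]:
    tail = binomial_upper_tail(correct, 20)
    print(f"{label}: {correct}/20 correct, one-tailed p = {tail:.3f}")
```

Under this simplification even the best result, 14 of 20, gives a one-tailed probability of about .058, which is above the .05 level.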

When the predictive classifications are made on the basis of a pooling of pre- 
operative CMI total score ranks with dependence ranks (r = .63; p < .01), 8 of the 
more invalided and 8 of the less invalided patients are correctly categorized, giving 
an accuracy in prediction of 80%. However, the dependence rankings were devised 
for a specific research and required the careful appraisal of interview protocols and 
the judgments of several highly trained clinicians. This greater predictive power 
of the pooled rankings would depend on the complex collaborative efforts of at least 
several psychologists who are unlikely to be available in most surgical hospitals for 
screening surgical risks. 

The authors of the CMI have stated, variously, that a score of more than 25 or a
score of 30 or more positive responses to the questionnaire indicates serious disorder 
in the patient™. If a score of more than 10 positive responses on the total pre- 
operative CMI is considered critical for indicating a tendency toward an above 
average degree of postsurgical invalidism, 12 of the 20 patients, 60%, would have 
been correctly categorized. With the critical score set at more than 15 positive res- 
ponses, 13 patients, 65%, would have been screened correctly. With the critical 
score set at more than 20, 25, and 30 responses, respectively, correct classifications 
would have been achieved in 12, 11, and 12 instances, respectively. Thus, the fact 
of exceeding a score of 25 or 30 on the CMI before surgery does not serve very effect- 
ively for selection of patients who manifest a greater degree of psychogenic in- 
validism. 
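The cutoff screening described above amounts to counting how often "score exceeds the critical score" agrees with membership in the more invalided group. A minimal sketch follows; the patient scores and group labels are hypothetical and are not the study's data.

```python
def cutoff_accuracy(scores, more_invalided, cutoff):
    """Count cases where (score > cutoff) matches the invalidism grouping.

    scores          -- total CMI scores, one per patient
    more_invalided  -- True if the patient is in the more invalided half
    cutoff          -- critical score; a score above it predicts invalidism
    Returns (number correct, proportion correct).
    """
    correct = sum((s > cutoff) == flag for s, flag in zip(scores, more_invalided))
    return correct, correct / len(scores)


# Hypothetical preoperative scores and postsurgical groupings for six patients.
scores = [8, 12, 18, 22, 27, 33]
more_invalided = [False, False, True, False, True, True]

for cutoff in (10, 15, 20, 25, 30):
    n_correct, proportion = cutoff_accuracy(scores, more_invalided, cutoff)
    print(f"cutoff > {cutoff}: {n_correct} correct ({proportion:.0%})")
```

Scanning several cutoffs in this way shows whether any single critical score separates the two groups appreciably better than the others.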

Although preoperative CMI scores would be the most valuable aids to compre- 
hensive treatment planning, were they predictive, the usefulness of the postsurgical 
CMI scores has also been investigated. Using the postoperative total CMI scores to 
classify the 9 most invalided and 9 least invalided patients, eliminating the tenth 
invalidism ranking of 19 cases because of various ties in CMI scores, gives 14 correct 
out of 18 cases, or 78% accuracy. Using, separately, the 3 subsections of the CMI, 
13, 13, and 9, or 72, 72, and 50%, of the patients, respectively, would have been 
correctly categorized. 

Using as cutoff points scores on the postoperative total CMI of more than 10, 
15, 20, 25, and 30 positive responses, the number of correct classifications of the 9 
more and the 9 less invalided patients was 12, 12, 11, 9, and 8, respectively, out of 18, 
or 67, 67, 61, 50, and 44% accuracy. In none of these instances is the result sufficient- 
ly good to establish the postoperative CMI as a corroborative clinical tool for in- 
dicating poor psychological reactions to surgery. 

DISCUSSION


Although the authors of the CMI regard failure to respond at all to a question 
as worthy of clinical attention), this criterion did not serve well as a critical sign 
insofar as invalidism reactions in this patient group were concerned. The selection 
of an abbreviated list of discriminating items for screening surgical patients suscep-
tible to invalidism reactions was found to be complicated by the widespread opera-
tion of the defense of denial on one hand and the subtle assertion of hypochondriacal
trends in peripheral complaints on the other hand. For example, the 3 patients rank- 
ing highest on invalidism, and only they, complained of frequent severe toothaches. 
Of the 19 women, only 3 after removal of a breast admitted being troubled with a 
serious bodily disability or deformity, and all of these ranked fairly high on invalid- 
ism. The 5 patients who did not after surgery admit ever having a serious operation 
and the 7 who did not admit ever having been treated for a tumor or cancer were 
mostly low ranking on invalidism and were patently practicing denial. Perhaps 
effective exercise of the mechanism of denial, with respect to serious illness, radical
surgery and bodily disability and deformity, plays a part in reduction of invalidism
reactions.


SUMMARY 


The CMI was investigated as a device for screening patients who might mani- 
fest severe invalidism reactions after surgery. Twenty women were intensively inter- 
viewed and tested before and after undergoing radical mastectomy. Rank correlation 
between pre- and postoperative total CMI scores was .67 (p < .01). Both pre- and 
postoperative total CMI scores correlated with invalidism rankings based on clinical 
judgments of postoperative interview material (r = .49, p < .05; r = .52, p < .05).
Physical Symptoms subtotals correlated most highly with invalidism, and Mood 
and Feeling subtotals, taken alone, did not correlate significantly. Preoperative
total CMI score rankings gave 12 correct classifications out of 20 in predicting the 
10 least and 10 most invalided patients. Use of Physical Symptoms subscores alone 
gave 14 correctly classified, and the General Medical and the Mood and Feeling sub-
scores gave 11 and 10 correct, respectively, none significantly different from chance.
Preoperative CMI cutoff scores did not effectively differentiate the more invalided
patients either. 

THE VALIDITY OF SHOBEN’S PARENT ATTITUDE SURVEY
JESSE E. GORDON 
University of Wisconsin 


PROBLEM 


This paper reports some data on the validity of Shoben’s University of Southern 
California Parent Attitude Survey (USC) purporting to measure attitudes toward 
children. Shoben validated his scale by the disparate groups method. The items 
finally selected for inclusion in the scale were those which discriminated between 
parents of problem children and parents of non-problem children. Responses were 
weighted differentially on the basis of the data collected from these groups. 

The writer has had an unusual opportunity to compare scores on the
USC survey with ratings of the behavior of mothers toward their children, as this 
behavior was observed intensively over an extended period of time. The Pennsyl- 
vania Society for Crippled Children and Adults sponsors an annual 12 day program 
for pre-school deaf children and their mothers. Mothers and children live together in 
a camp setting for the duration of the program, and are constantly in contact with a 
professional staff of eight to ten speech therapists, nursery teachers, psychologists, 
and audiologists. Parents, children, and staff live, dine, attend daily meetings, and 
participate in nursery school and speech therapy sessions together. The constant 
interaction of parents, children, and staff over a 14 hour day, for 12 days, provides a 
situation in which the attitudes and behaviors of the mothers toward their children 
can be observed by a sophisticated staff on a very wide variety of occasions. Since 
the primary purpose of the speech and hearing program is diagnostic, daily staff 
meetings were held in which each parent-child pair was evaluated and discussed, 
and the observations of staff members collated and integrated with psychological 
test results. This provided an opportunity for all relevant information to be com- 
municated to all staff members. 


PROCEDURE 

Shoben’s USC surveys were sent to the mothers who planned to attend the 1955 
program two weeks before the beginning of the program. Twenty mothers parti-
cipated, and returned the completed tests. The mothers came from all sections of
Pennsylvania and appeared to be representative of the population of the state. The 
speech and hearing program is the only one of its kind in the state, and for several 
states around; it is non-profit, and cost-free for needy persons. Thus there do not 
appear to have been any significant selective factors operating, except that the child 
be of pre-school age and have at least a moderate hearing loss, regardless of etiology 
of the loss. It is improbable that the deafness of the children should have produced 
any distortion in the ability of the Shoben scale to measure mothers’ attitudes. 

The staff consisted of a director with extensive experience as a supervisor in a 
large university speech and hearing clinic, a chief audiologist, regularly employed by 
a university hospital who has specialized in work with children and parent counsel- 
ing, two teachers of deaf children, two nursery school teachers, three psychologists, 
and six graduate students in speech therapy. All of the graduate students had had 
some training in child psychology and counseling as an important component of 
their training in speech therapy. 

At the conclusion of the 12 day session, each staff member, excluding the writer, 
was requested to rank the mothers on how likely it was that the mother’s attitudes 
and behavior toward the child would result in the child becoming a “problem child”.
All 13 staff members made their rankings independently. The same procedures were 
repeated for the 1956 session, in which 21 mothers participated. The staff for the 
1956 session was essentially the same, except that the graduate students were differ- 
ent people, and fewer in number. Rankings by 11 staff people were obtained in 1956. 

The children themselves appeared to reflect a range of adjustments from ex-
treme negativism and hostility in one child, fairly marked anxiety and fearfulness in
another, and signs of overdependency in three children to healthy adjustments in-
cluding positive relationships with other children and with parents. This range of
adjustment may be reflected in the range of discrepancies between mental age and 
social maturity age scores. These discrepancies ranged from a social age 5 months 
higher than mental age to a social age 26 months lower than mental age. The mean 
was a social age 6.3 months lower than the mental age. The size of this discrepancy 
may be accounted for by the items on the Vineland Social Maturity Scale referring 
to speech, which this population of children could obviously not pass. The standard 
deviation of the discrepancies was 7.5 months. The evidence suggests, therefore, 
that there was a sufficient range of levels of adjustment. 

There was also a wide range of attitude patterns in the mothers, from outright 
rejection and hostility through all stages of expressed ambivalence to very great 
overprotectiveness and dependency on the child. Some of the mothers’ attitudes
were deemed so pathological that strong referrals were made to social agencies in the
home areas. 


RESULTS 

Coefficients of concordance were computed among the rankings by the staff 
members. For the 1955 data, W = .58 and for 1956, W = .69. Both coefficients are 
significant at less than the .001 level, indicating well above chance agreement among 
the staff members in both sessions. Kendall’s Tau coefficient of agreement between 
two sets of associated ranks was computed between each mother’s mean rank, as 
assigned by the staff, and her rank on Shoben’s scale. Taus were computed separately 
for each year for the total Shoben score and the Dominating, Possessive, and Ignor- 
ing subscales. Table 1 reports the coefficients for both years, and indicates that no 


TABLE 1. TAU COEFFICIENTS BETWEEN MOTHERS’ USC SURVEY SCORES AND
STAFF RANKINGS

Staff Ranking | Possessiveness   Dominance   Ignoring   Total Score
1955          |      -.01           .04         .05         .04
1956          |       .16          -.01         .01        -.04





significant agreements were found between staff rankings and the Shoben scale of 
attitudes toward children. Although these findings do not directly invalidate the 
test, they do suggest that further cross-validation is required before the scale can be 
used in the clinic with confidence. 
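The coefficient of concordance reported above can be sketched as follows. The formula is the standard one for Kendall's W without ties, and the chi-square conversion m(n - 1)W with n - 1 degrees of freedom is the usual large-sample test; whether the author used exactly this test is an assumption. The small rankings are hypothetical.

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance W for untied rankings.

    rankings -- list of m rankings, each a list of the ranks 1..n
                assigned by one judge to the same n objects.
    """
    m, n = len(rankings), len(rankings[0])
    totals = [sum(judge[i] for judge in rankings) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m**2 * (n**3 - n))


def concordance_chi_square(w, m, n):
    """Large-sample chi-square for W, referred to n - 1 degrees of freedom."""
    return m * (n - 1) * w


# Hypothetical rankings of four mothers by three judges.
judges = [[1, 2, 3, 4], [2, 1, 3, 4], [1, 3, 2, 4]]
print("W =", round(kendalls_w(judges), 2))

# Chi-square implied by the reported coefficients (13 judges and 20 mothers
# in 1955; 11 judges and 21 mothers in 1956).
print("1955 chi-square:", round(concordance_chi_square(0.58, 13, 20), 2))
print("1956 chi-square:", round(concordance_chi_square(0.69, 11, 21), 2))
```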

Some further inferential evidence regarding the validity of the USC survey may 
be derived from data regarding the social maturity of the deaf children. It may be 
reasoned that where parental attitudes are similar to those which produce problem 
children, the child should evidence social immaturity. Doll’s Vineland Social Matur- 
ity Scale was administered for all the children, with the mothers as informants. In 
order to control for the children’s intelligence, discrepancies between the obtained 
Social Quotients and the children’s intelligence quotients, as derived from Merrill- 
Palmer performance, were computed. These discrepancies were ranked, and Taus 
computed between the mother’s rank on the USC survey and her child’s discrep- 
ancy rank. For the 1955 group, the obtained Tau of -.11 was not significant, and 
for the 1956 group, a Tau of -.33 was obtained, which is significant at the .05 level. 
However, the negative relationship implies that the lower the mother’s USC score 
(which is the “healthy” direction), the lower is the child’s social maturity in relation 
to his intelligence. Although the 1955 and 1956 data are inconsistent, neither sup- 
ports the validity of the USC survey. 
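The Tau coefficients used in this section compare two sets of ranks pair by pair. A minimal sketch, assuming the untied form of Kendall's coefficient (tied pairs are simply counted as neither concordant nor discordant here); the ranks are hypothetical.

```python
def kendall_tau(x, y):
    """Kendall's tau: (concordant - discordant) / total number of pairs."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            direction = (x[i] - x[j]) * (y[i] - y[j])
            if direction > 0:
                concordant += 1
            elif direction < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical USC survey ranks and discrepancy ranks for five mother-child pairs.
usc_rank = [1, 2, 3, 4, 5]
discrepancy_rank = [4, 5, 2, 3, 1]
print("tau =", round(kendall_tau(usc_rank, discrepancy_rank), 2))
```

A negative value, as in the 1956 data, means that pairs ranked higher on one variable tend to be ranked lower on the other.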


DISCUSSION


It should be noted that in Shoben’s original standardization procedures, he 
found that the responses which a group of clinical psychologists felt constituted ideal 
attitudes toward children were significantly different from those responses given by
mothers of the non-problem children. However, these “ideal” responses were much
closer to the non-problem group’s responses than they were to the problem group’s.
The data presented here indicate that mothers’ scores on the scale do not agree with 
how a group of sophisticated observers evaluate their behavior toward their children. 
If the validity of the scale is assumed, it must be concluded that not only the test 
responses of experts, but also their judgments of observed behavior are not consistent 
with Shoben’s empirical findings regarding “good” maternal attitudes. On the other
hand, one may consider the observers’ judgments to be an adequate criterion for the 
validity of the scale. The agreement among relatively numerous raters in two differ- 
ent groups, and the extensiveness and intensiveness of the periods of observation 
argue in favor of considering these judgments as a reliable and adequate criterion. 
If one grants this, then our data are evidence against the validity of the survey, at 
least where “problem children” are not necessarily and narrowly defined as those
cases in which parents or juvenile authorities make active complaints about the 
behavior of the children, as Shoben defined them. The inferential evidence from the 
data regarding social maturity supports the interpretation of lack of validity. 


SUMMARY 

Rank correlations between expert observers’ judgments of attitudes of mothers 
toward their children, as these attitudes were inferred from extensive observation of 
the mothers’ behaviors toward their children over a 12 day period, and mothers’ 
scores on Shoben’s University of Southern California Parent Attitude Survey were 
found to be non-significant. Evidence derived from a social maturity measure of the 
children indicates that mothers’ scores on the survey in a “healthy” direction may 
be associated with social immaturity in their children, relative to the children’s in-
tellectual status. The data appear to be best interpreted as suggesting a lack of
validity of the attitude survey. It is suggested that Shoben’s validation procedures 
may have involved too narrow a definition of problem children for the scale to be 
used with confidence in clinical practice. 

AN EVALUATION OF THE CONSTRUCT VALIDITY OF BARRON’S EGO-
STRENGTH SCALE¹


ARTHUR S. TAMKIN
Veterans Administration Hospital, Northampton, Massachusetts 


PROBLEM 

Barron introduced an Ego-Strength Scale (Es) derived from the MMPI which,
although initially developed for predicting response to psychotherapy, was consider- 
ed later as a measure of ego strength because of its item content and its personality 
and intelligence test correlates“. Wirt, using hospitalized psychiatric patients as 
subjects, also obtained confirmation of the scale’s ability to predict improvement 
from psychotherapy; and Quay‘* found that it differentiated between a group of 
hospitalized females of mixed psychiatric diagnoses and groups of nurses and at- 
tendants. An analysis of the scores of the various diagnostic categories within the 
patient group was not done, however. Although the demonstrated correlates of this 
scale provide some presumptive support for the contention that the scale measures
ego strength, defined rather broadly by Barron as a “general factor of capacity for
personality integration”, this study undertakes to evaluate further its construct val-
idity by exploring its relationship to other reputed measures of ego strength and to
degree of psychopathology. The two criterion measures of ego strength were the 
Rorschach®? F+% and Pascal and Suttell’s scores from the Bender-Gestalt ©, 
selected because of the claims made for them as measures of ego strength, and because 
of the availability of published validity data. As measures of degree of psychopath- 
ology, the Critical Item (CI) “ and F Scales of the MMPI were used as well as psy- 
chiatric diagnoses. The CI Scale was used because clinical experience suggests that 
it bears a close relationship to severity of mental illness. 


METHOD 


The subjects were 30 male, psychiatric patients at the Veterans Administration 
Hospital, Northampton, Massachusetts. Since the method of selection for this study 
was assignment for routine psychodiagnostic evaluation in the order of arrival at the 
hospital, except for three chronic cases, they represent a random sample of psychia- 
tric patients who are either newly admitted or readmitted. The Ss were divided into 
two groups closely equated for age, education, and intelligence for comparison on the 
scales on the basis of established diagnosis. One group consisted of 15 psychotics and 
the other of 15 psychoneurotics and personality disorders. The age, education in 
school grades, and intelligence in Wechsler-Bellevue IQ points for the psychotic 
group were 32.9, 9.8, and 103.6, respectively; and for the group of psychoneurotics 
and personality disorders, 36.1, 9.2, and 105.6. Each S was administered the MMPI, 
from which scores on the Es and Critical Item Scales were computed. For 29 Ss
a score on the F Scale was also computed. In addition to the MMPI, 25 Ss were given 
the Rorschach, and 26 the Bender-Gestalt, and scores were computed for F+% by 
means of Beck’s system and for the Bender-Gestalt by means of Pascal and Suttell’s 
system. First an intercorrelation matrix of all the measures was constructed to de- 
termine the relationship of the Es Scale to the other measures, and their interrelation-
ships. Then, by means of t tests, each measure’s power of discrimination between the
two diagnostic groups was ascertained. 


RESULTS 

The matrix of intercorrelations is presented in Table 1. There is no significant
correlation between Es and the other reputed measures of ego strength, F+% and
B-G, nor are these latter two measures significantly intercorrelated. It may be
inferred, then, that these three instruments are not measuring similar personality
attributes. However, the two measures of general psychopathology, the F and CI
Scales, are intercorrelated at a highly significant level, and are inversely related to
the Es Scale at the .05 and .01 levels, respectively.


TABLE 1. INTERCORRELATIONS AMONG THE MEASURES

Measures | F+%   B-G
Es       |  18    01
F+%      |        13
B-G      |
F        |

*Significant beyond the .05 level.
**Significant beyond the .01 level.


An analysis was made to determine the extent to which these measures
differentiate between two degrees of psychopathology, defined by a diagnosis of psychosis
on the one hand, and psychoneurosis and personality disorder on the other. While
F and CI yielded t tests significant at the .025 point, Es, B-G, and F+% did not
yield statistically significant t tests at the .05 point.

The meaning of the lack of significant correlation of Es with either F+% or 
B-G is attenuated by the finding that the latter two measures are themselves not 
significantly intercorrelated, a confirmation of the results of Curnutt, et al.@’. In 
addition, the demonstration that neither shows a substantial relationship to F or CI 
nor to diagnosis raises a serious question as to their suitability for validating criteria 
of ego strength. Perhaps a more serious criticism of the claim to validity is the fail- 
ure of Es to differentiate between the two diagnostic groups, in spite of its good re- 
lationship to two test measures of general psychopathology. This is a particularly 
striking criticism since one group was almost exclusively composed of schizophrenics, 
for whom ego strength is considered to be extremely low. Since intelligence has been 
considered a component of ego strength, and since IQ data were available, the cor- 
relation of IQ with Es was evaluated. The obtained r was —.16, a statistically in- 
significant value. This finding is markedly inconsistent with previously reported 
coefficients of .21 using Shipley-Hartford IQ “ and .44 using Wechsler-Bellevue IQ 
“) for groups composed of psychiatric patients considered amenable to treatment 
by psychotherapy. 


SUMMARY AND CONCLUSIONS 


In order to evaluate further the construct validity of Barron’s Ego-Strength 
Scale as a measure of ego strength, its correlation with other reputed measures of ego 
strength and general psychopathology was explored, as well as its ability to differ- 
entiate degree of pathology determined by psychiatric diagnosis. Subjects consisted 
of 30 hospitalized psychiatric patients divided on the basis of diagnosis into a group 
of 15 psychotics and 15 psychoneurotics and personality disorders. The measures of 
ego strength compared with the Es Scale were F+% and the Bender-Gestalt, and 
the measures of psychopathology were the F and Critical Item Scales of the MMPI. 
The following conclusions were drawn from this study: 

1. The absence of significant intercorrelations among Es, F+%, and B-G
indicates that these three reputed measures of ego strength are not measuring the same
personality attributes, and the question of the validity of Es cannot be answered
with reference to these correlation data. Indeed, the unrelatedness of F+% and
B-G, in addition to their inability to discriminate degree of psychopathology,
suggests that they themselves may be inadequate measures of ego strength.

2. Although the significant interrelationship among Es, F, and CI is encouraging
for the question of validity, the finding that Es does not discriminate degree of
psychopathology as determined by diagnosis constitutes a serious challenge to its
validity when applied to hospitalized psychiatric patients.

3. The absence of significant correlation between Es and IQ, which is
inconsistent with previous findings with psychiatric patients probably of less severe
pathology, is another challenge to the claim of general validity.


DIFFERENTIAL CLASSIFICATION OF HEBEPHRENIC AND PARANOID
SCHIZOPHRENICS FROM CASE MATERIAL¹

PROBLEM


Research workers in the area of behavioral pathology are frequently confronted
with the problem of selection of nosological groups. Unfortunately, much experience
has accumulated indicating lack of reliability of the standard nosology of mental
disorders. Noyes (7, p. 6), commenting on the artificiality of contemporary
classificatory schemes, states, “. . . efforts to classify schizophrenic patients into specific
categories have been relatively fruitless and the present tendency is not to attempt
the division.” Cameron, too, has remarked that “. . . all current attempts at
classification of functional personality disorders are unsatisfactory,” but that the most
profitable classification schemes “. . . are those based upon clinical and behavioral
differences” (3, p. 879).


Recently, Schmidt and Fonda (10) have suggested that unreliability of diagnoses
of mental patients may be something of a straw man. Their findings indicate that
even naive observers can achieve a high reliability of diagnosis when groupings are
over large categories, i.e., schizophrenic vs. character disorder vs. chronic brain
syndrome. However, within schizophrenics marked differences have been reported in the
literature between, for example, the paranoid and hebephrenic patient (2, 4, 8, 9, 11, 12).
The clinical researcher is frequently forced to the practical task of differentiating
subcategories of schizophrenic patients before his data become meaningful. In this
area, therefore, the reliability of hospital labeling remains crucial.


Various rating schemes (6, 13, 14) have been developed to meet this problem, but
unfortunately most entail the highly time-consuming task of face-to-face contact
with the patient. In designing a research project to study organizational verbal
responses in various nosological groups (8), the author’s attempt at resolution of
these problems resulted in the devising of a behavioral criteria system utilizing as
raw data only information in the case record.


METHOD OF CLASSIFICATION


The basic reference employed in the compilation of this criteria list was the
Diagnostic and Statistical Manual of Mental Disorders (1) of the American Psychiatric
Association. The goal was a classical description which would coincide with
continuing psychiatric practice. Furthermore, the attempt was for a utilitarian format
which would also remain catholic. To this end, the material contained in the Manual
was interpreted and elaborated sufficiently for reliable differences to be made,
supplemented in part with such classical descriptive references as Henderson and
Gillespie (5) and Noyes (7). These elaborations were meant to be representative of
the diagnostic statements to be found in actual patient case records. In order to
place a patient as belonging to either group, all of the listed criteria must obtain,
except in those instances where several factors are listed as alternates for each other.


¹Subproject of a doctoral dissertation in the Department of Psychology at the University of
Connecticut, June, 1956. The study was completed while the author was on the staff of the Norwich State
Hospital, Norwich, Connecticut. Grateful acknowledgment is made for help given by Drs. Margaret
M. Siem, Weston A. Bousfield, Hermann O. Schmidt and James M. Sakoda. Drs. Howard Friedman
and Edward L. Siegel read the manuscript and made many helpful suggestions.


Diagnostic Rating Criteria

I. Paranoid Schizophrenia

1. Bitter feelings of being wronged, of being the recipient of gross injustices.²
2. Some attempt at convincing others of the validity of his beliefs.
3. Superior, condescending or assertive, dominating behavior, e.g., “snotty” or
   “bossy” attitude.
4. Unverified and improbable beliefs, but not extremely bizarre or impossible ones.
   Alternative explanations are not entertained.
   a. Included are such things as:
      i. Science fiction beliefs.
      ii. Accepted religious beliefs, e.g., visions, commands from the Deity.
      iii. Feelings of mental telepathy.
   b. Excluded are such things as:
      i. Feelings of having no blood, being pregnant, being a Zombie.
      ii. Feelings of being reincarnated, being related to a changing set of people.
      iii. Feelings of being fed by God, or other non-culturally accepted continuous
           religious miracles.
5. No evidence of actual “silly” behavior; might have “inappropriate grinning”,
   but not actual “silly” giggling or laughing.
6. No evidence of smearing and/or eating feces and/or being incontinent.
7. No history of convulsive seizures and/or other signs of intracranial damage, e.g.,
   neoplasms, cerebral trauma, lobotomy.

II. Hebephrenic Schizophrenia

1. At least one bizarre, outlandish or impossible delusion of the following classes:
   a. Feelings of having no blood, being a Zombie, being reincarnated.
   b. Feelings of being sustained by food from God, or other religious miracles.
   c. Feelings of being related to a changing set of important persons.
   d. Science fiction beliefs.
   e. Believing self in impossible child-parent role, e.g., fathered thousands of
      babies, married to a ne. child of Hitler, child of the Pope, or being
      pregnant, etc.
2. At least one aspect of real “regressive” behavior of the following classes:
   a. “Silly”: actual giggling or laughing, but not simply “inappropriate grinning”.
   b. Personal sexual delusions, e.g., somebody playing with organs, being
      sprayed with sex powder, etc., but not simply feelings of infidelity of partner
      or impotence, etc.
   c. Smearing or eating feces. Being incontinent.
3. Not demonstrating any bitter feelings of being wronged or being the subject of
   gross injustices.
4. No history of convulsive seizures and/or other signs of intracranial damage, e.g.,
   neoplasms, cerebral trauma, lobotomy.


RATING RESULTS AND DISCUSSION


To determine the reliability of the criteria scheme, two judges³ working
independently were given 207 case folders in samples of ten at a time and were asked
to employ the criteria to select all those cases, and only those, which fit. These 207
cases represented a sample which contained the same nosological proportions as
existed in the hospital population from which the cases were drawn.⁴ In addition, the
judges had no knowledge that twenty “correct” cases (ten hebephrenic and ten
paranoid) had been buried in this sample of 207 cases. Happily, each judge correctly
identified 18. One judge selected twenty cases, two of which were misidentified: both
had some sort of cortical damage noted in their case folders. The two “correct”
cases were overlooked because the judge felt that the clinical picture had changed
so much over the years that classification was impossible. The second judge selected
18 cases, all of which were correct. The two cases missed were both hebephrenics,
rejected because the rater felt that the evidence for “regressive” behavior in their
charts was not clearcut. The degree of agreement between the two judges, and between
each judge and the author, yielded tetrachoric r’s of > .95.


²It will be noted that inherent in this criteria scheme is the notion that the paranoid schizophrenic
remains concerned with “self-respect”, with how his behavior will be interpreted, etc., whereas the
hebephrenic patient does not demonstrate any concern about how his behavior will impress others.

³Grateful acknowledgment is due to Dr. John MacGahan and Mr. John R. Lester for their
gracious cooperation in serving as judges for this part of the investigation.

⁴Acknowledgment is made to the administration of the Norwich State Hospital, Norwich,
Connecticut for its cooperation in providing the patient population for this study.


It seems, therefore, obvious that despite the pessimism in the literature con- 
cerning the reliability and objectivity of classificatory schemes, the behavioral 
criteria employed here proved reliable. One of the principal merits of this behavioral 
criteria scheme is that since it approximates the traditional and generally accepted 
descriptive nosology of psychiatric literature, as a diagnostic scheme, it is transfer- 
able from hospital to hospital and is independent of the particular nosological 
vagaries present in any one locale. 
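
As a rough check on the agreement figures reported above, the judge-versus-author agreement can be laid out as a 2 x 2 table (case selected or not selected by the judge, against “correct” or not according to the author). The sketch below is illustrative only: the paper does not state how the tetrachoric r’s were computed, so the classical cosine-pi approximation is an assumption, the paranoid and hebephrenic cases are pooled into a single “selected” category, and the cell counts are derived from the narrative (207 folders, 20 buried cases).

```python
from math import cos, pi, sqrt


def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric r for a 2x2 table.

    a = both select, b = judge only, c = criterion only, d = neither.
    """
    if b * c == 0:
        if a * d == 0:
            raise ValueError("table is degenerate; the approximation is undefined")
        return 1.0  # no disagreement cells on one diagonal: limiting value
    return cos(pi / (1 + sqrt((a * d) / (b * c))))


# Counts inferred from the text (an interpretation, not a table from the paper):
# first judge: 18 hits, 2 false selections, 2 misses, 185 correct rejections.
first_judge = tetrachoric_cos_pi(18, 2, 2, 185)
# second judge: 18 hits, 0 false selections, 2 misses, 187 correct rejections.
second_judge = tetrachoric_cos_pi(18, 0, 2, 187)
print(round(first_judge, 3), round(second_judge, 3))
```

Under these assumptions both values exceed .95, which is consistent with the reported result; the judge-to-judge coefficient cannot be recomputed because the text does not say which cases the two judges selected in common.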


SUMMARY 


Clinicians in designing research involving diagnostic groups are frequently faced 
with the problem of the unreliability of clinical nosology. Although this problem 
seems less acute in selecting broad groupings of patients, when the task is to employ 
subgroupings of schizophrenic individuals unreliability of diagnostic labeling becomes 
critical. 


A behavioral rating scheme for the selection of hebephrenic and paranoid 
schizophrenic patients is outlined. Rater reliability of this scale proved high. Inter- 
judge tetrachoric r’s were > .95. Since the diagnostic criteria scheme is closely de- 
rived from the traditional and generally accepted descriptive nosology of psychiatric 
literature, it has generality of application while its nondependence on face-to-face 
patient contact has obvious practical advantages. 


REFERENCES 


1. American Psychiatric Association, Committee on Nomenclature and Statistics. Mental disorders.
Washington, D. C.: American Psychiatric Assoc., 1952.

2. Angyal, Alice F. Speed and pattern of perception in schizophrenic and normal persons.
Charact. and Pers., 1942, 11, 108-127.

3. Cameron, N. The functional psychoses. In J. McV. Hunt (Ed.) Personality and the behavior
disorders. New York: Ronald Press, 1944.

4. Friedman, H. Perceptual regression in schizophrenia: An hypothesis suggested by the use of the
Rorschach test. J. genet. Psychol., 1952, 81, 63-98.

5. Henderson, D. K. and Gillespie, R. D. A text-book of psychiatry. (6th Ed.) London: Oxford
Univer. Press, 1947.

6. Lorr, M. Rating scales and check lists for the evaluation of psychopathology. Psychol. Bull.,
1954, 51, 119-127.

7. Noyes, A. P. Modern clinical psychiatry. Philadelphia: Saunders, 1939.

8. Orgel, S. A. Clustering of verbal associates in schizophrenia and chronic brain syndrome.
Unpublished doctor’s dissertation, Univ. of Connecticut, 1956.

9. Rapaport, D. Diagnostic psychological testing. Vol. I. Chicago: Year Book Publishers, 1946.

10. Schmidt, H. O. and Fonda, C. P. The reliability of psychiatric diagnosis: a new look. J. abnorm.
soc. Psychol., 1956, 52, 262-267.

11. Shakow, D. and Rosenzweig, S. Play technique in schizophrenia and other psychoses. II. An
experimental study of schizophrenic constructions with play materials. Amer. J. Orthopsychiat.,
1937, 7, 231-256.

12. Siegel, E. L. Genetic parallels of perceptual structuralization in paranoid schizophrenia: an
analysis by means of the Rorschach technique. J. proj. Tech., 1953, 17, 151-161.

13. Wittenborn, J. R. Symptom patterns in a group of mental hospital patients. J. consult.
Psychol., 1951, 15, 290-302.

14. Wittman, Phyllis. Scale for measuring prognosis in schizophrenic patients. Elgin State Hosp.
Papers, 1941, 4, 20-33.





THE STABILITY OF TREE DRAWINGS AS RELATED TO SEVERAL

PROBLEM AND PROCEDURE


The tree drawing has been assumed by Buck and other investigators“: 2» ©) to 
contain elements that may reveal important clues relevant to an individual’s self- 
identification. It is Landisberg’s®) contention, for example, that the tree taps basic, 
long standing feelings and self-attitudes and is extremely resistive to alteration. In 
view of the presumed stability of the tree drawing, it is thought that the amount of 
change which could be induced in the drawing of a tree in response to specific sug- 
gestions might be related to the personality variable of rigidity. The present in- 
vestigation is designed to ascertain whether there is a significant relationship be- 
tween selected Rorschach scoring categories hypothetically related to the degree of 
ideational and emotional rigidity and the amount of change which the patient is 
capable of producing in his drawing. 

Sixty-one consecutive patients referred for psychological testing comprised the 
original sample from which the experimental groups were drawn. The sample con- 
sisted of 57 male officers and enlisted men and 4 female enlisted personnel and 
civilian dependents, ranging in age from 17 to 55 years (Mdn. 23 years) and in intelli- 
gence from 55 to 129 (Mdn. WAIS IQ 102). The discharge diagnoses of these pat- 
ients were: 4 with no disease or some type of unrelated somatic illness, 10 with 
schizophrenic reactions, 5 with psychoneurotic reactions, 2 with transient personality 
disorders due to acute or special stress, 26 with character and behavior disorders, 
and 14 with some type of organic brain syndrome. 


All of these patients were tested individually and requested to make a drawing 
of a tree on a blank sheet of paper of uniform size. The original technique used by 
Koch“? was modified in the present experiment in that each patient was then re- 
quired to draw a second tree on a separate sheet of paper in response to the following 
instructions:


“I would like you to make another tree but this time try to make it as
different from your first drawing as you can. Remember, you may continue to 
look at your first tree but make your second one as different as possible.” 


Any questions that were asked by patients were answered in a noncommittal way 
with no further attempt at clarification. 

Two methods for dichotomizing the original group were employed. The first 
involved a detailed analysis of various features of the two tree drawings made by 
each patient. Fourteen dimensions were rated and assigned values ranging from 
0 to 3, reflecting in each case the amount of change noted. Among the drawing char- 
acteristics evaluated were the length and width of the trunk and crown, the place- 
ment of the drawing on the paper, the type of line quality, the amount of shading, 
the number of extraneous objects included, the angle formed by the tree trunk and 
the ground, the number of branches, the amount of foliage, the display of roots, 
marks on the trunk, and the type of tree drawn. The resulting range of total scores, 
representing the amount of change occurring in the two tree drawings, was rather 
small, extending from 1 to 21 with a median score of 12. 

In order to determine the reliability of the scoring method, 11 drawings were
selected at random and scored independently by another psychologist. The
rank-difference correlation between the two arrays of scores was .94, indicating close
agreement between raters. A check also was made on the discrepancy in total scores
assigned to the tree drawings by these two raters. In 9 of the 11 sets of drawings the
scores corresponded within 2 points, indicating close agreement in absolute scores as
well as in the relative position assigned the drawings. Using the method described
above, the upper and lower 23% of the original sample were selected as being
representative of the more flexible and less flexible groups, respectively. Fourteen
subjects were assigned to each group.


¹The opinions and conclusions expressed do not necessarily represent those of the Department of
the Air Force.

The second method for dichotomizing the original sample into high and low 
rigidity groups called for 4 judges to independently make global evaluations of each 
pair of tree drawings and to rate the observed change as minor, moderate, or large. 
Only those drawings upon which at least 3 out of 4 judges agreed were selected. The 
drawings judged to depict minor change were assumed to be representative of the 
high rigidity group (N = 16) and were compared with those characterized as show- 
ing a moderate or large amount of change. The categories ‘‘moderate” and “large” 
served as a basis for the low rigidity group as the use of both was necessary to make 
the sample sufficiently large (N = 17). 

As for the overlap in subjects resulting from the use of these 2 different methods 
of separating the high rigidity group from the low rigidity group, 42.9% of the 14 
patients selected by method 1 as being more rigid were also represented in the sample 
formed with the aid of method 2. Fifty percent of the 14 patients assigned to the 
low rigidity group by method 1 were also found to be in the low rigidity group as 
determined by method 2. In no instances were any reversals observed. 
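
The first dichotomizing method and its reliability check can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the rank-difference formula shown assumes no tied scores (the paper does not say how ties were handled), and all scores in the example are invented.

```python
def total_change_score(ratings):
    """Sum the 14 dimension ratings (each 0 to 3) for one pair of drawings."""
    if len(ratings) != 14 or any(r not in (0, 1, 2, 3) for r in ratings):
        raise ValueError("expected 14 ratings, each between 0 and 3")
    return sum(ratings)


def rank_difference_rho(x, y):
    """Spearman rank-difference correlation; assumes no tied values."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        result = [0] * len(values)
        for rank, index in enumerate(order, start=1):
            result[index] = rank
        return result

    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def extreme_groups(scores_by_subject, fraction=0.23):
    """Return (least-change subjects, most-change subjects) from the two tails."""
    k = round(len(scores_by_subject) * fraction)
    ordered = sorted(scores_by_subject, key=scores_by_subject.get)
    return ordered[:k], ordered[-k:]


# Invented scores for 61 subjects, only to show the group sizes that result.
scores = {subject: (subject * 7) % 22 + subject / 100 for subject in range(61)}
less_flexible, more_flexible = extreme_groups(scores)
print(len(less_flexible), len(more_flexible))
```

With 61 subjects, 23% rounds to 14 per tail, which matches the group size reported in the text.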

The Rorschach protocols were obtained during routine examination by a psycho- 
logist who was not aware of the patients’ performance on the tree test. With the ex- 
ception of form level ratings, scoring was in accordance with Klopfer’s®) method. 
The groups high in rigidity did not differ from those low in rigidity in F+% or 
F+R%. 

The specific hypotheses formulated were that groups high in rigidity would
differ from those low in rigidity in having: (1) a smaller number of responses; (2) a
larger number of rejections; (3) a smaller percent of responses to cards VIII, IX, X;
(4) a larger number of P and P%; (5) a more restricted content range as reflected in
a higher A% and a lower percent of content outside A, Ad, H, Hd; (6) a higher F%;
(7) fewer M; (8) a smaller Sum C; (9) fewer FC + CF; and (10) a larger number of
FC in comparison to CF + C.


RESULTS 


Table 1 presents data comparing the Rorschach scores of groups of patients 
who are considered to be high and low in rigidity. The first 7 Rorschach variables 
are presented as mean scores for the various groups whereas the remaining 5 are re- 
ported as frequency distributions. It was considered inadvisable to average per- 
centages. Throughout, significance was tested by use of the chi square method since 
this technique does not presuppose a normal distribution or equality of units. These 
are requirements which the Rorschach scores generally fail to satisfy. Chi squares 
were calculated by comparing the number of cases in each group falling above and 
below the median of the total sample. Yates’ correction for continuity was used
wherever applicable.
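
The median-split chi square described above might look like the sketch below. It is only an illustration: the function names are hypothetical, the median is taken over the two groups being compared (the paper's “total sample” may have meant something broader), scores equal to the median are counted as not above it, and the counts in the example are invented.

```python
from statistics import median


def median_split_table(high_rigidity, low_rigidity):
    """Count cases above / not above the combined median in each group."""
    cut = median(high_rigidity + low_rigidity)
    a = sum(score > cut for score in high_rigidity)
    b = len(high_rigidity) - a
    c = sum(score > cut for score in low_rigidity)
    d = len(low_rigidity) - c
    return a, b, c, d


def yates_chi_square(a, b, c, d):
    """Chi square (1 df) for a 2x2 table with Yates' continuity correction.

    Assumes no row or column total is zero.
    """
    n = a + b + c + d
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Invented 2x2 counts for two groups of 14 (NOT the study's data).
print(round(yates_chi_square(10, 4, 4, 10), 2))
```

The statistic is referred to the chi-square distribution with one degree of freedom, for which the .05 critical value is 3.84.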

It was found that with the exception of the significantly (p = .05) larger num- 
ber of responses produced by the low rigidity group (as selected by method 1) com- 
pared to the high rigidity group, none of the differences are statistically significant. 
Two other comparisons, however, fall just outside the .05 level, and approximate 
acceptable levels of significance. These are the FC:CF + C ratios for samples 
selected by methods 1 and 2. The differences are in accordance with the hypotheses. 
It is noteworthy that the average number of rejections produced by the high rigidity 
groups is greater than for the low rigidity groups and that the more flexible groups 
tend to make a greater use of M and of color as had been hypothesized. Although 
these differences are in the expected direction, they are not statistically significant.


TABLE 1. A COMPARISON OF RORSCHACH SCORES EARNED BY GROUPS OF PATIENTS SHOWING
DIFFERENCES IN THE STABILITY OF THEIR TREE DRAWINGS


SUMMARY


Sixty-one patients were instructed to make two tree drawings deliberately as 
different from one another as possible. Comparisons of groups differing in the var- 
iability of the two tree drawings on selected Rorschach variables did not generally 
support the hypothesized relationships although trends in the expected direction 
were in evidence.


PROBLEM


The purpose of this study was to investigate the relation between MMPI scores 
and scores on a music test. The MMPI was scored in the usual way and gave nine 
scores for each subject. The music test was scored by a method providing twelve 
“music feeling” scores for each subject. This study consists of the correlations
obtained between the nine MMPI scores and the twelve “music feeling” scores.


PROCEDURE 

Five hundred and fifty-three students in general psychology at the University of 
Maine were administered the group form of the Minnesota Multiphasic Personality 
Inventory in their regular laboratory sections. The MMPI was scored according to 
the standard directions. Five of the clinical scales were corrected with the K factor. 
Tests with T scores of 70 or more on any of the three validity scales of ?, L, or F were
discarded.

Three weeks later, the same subjects listened to a series of fifteen musical re- 
cordings and were asked to make as many check marks as they felt could be used to 
describe the music. Each subject had a mimeographed form giving him the defini- 
tions of the twelve major “‘music feeling” categories investigated. These definitions “ 
were read to the subjects beforehand and each subject could consult these definitions 
as much as he wished during the administration of the music test. 

The subjects were told only that little scientific work had been done on responses 
to music and that we were merely interested to see how people actually respond to 
these musical selections. The check list used in the present study was as follows: 


Sorrow        Yearning      Love             Wonder
dejected      lonely        tenderness       mysterious
mournful      longing       soothing         weird
despairing    aspiring      compassionate    foreboding
melancholy    regret        affection        awesome

Joy           Solemnity     Eroticism        Rage or Anger
boisterous    dignified     seductive        hate
humorous      majestic      passionate       infuriated
blissful      reverent      amorous          ruthless
happy         sublime       voluptuous       demoniacal

Calm          Cruelty       Jealousy         Assertion
pastoral      vengeful      suspicion        determined
meditative    fiendish      fear             martial
serene        teasing       anxiety          triumphant
tranquil      malevolent    frustration      bold


While listening to each recording, the subject could check as many or as few of the 
descriptive terms as he wished. It will be noted that the check list was made up of 
twelve major categories with four additional descriptive terms for each of these 
categories. 

A count was made of the total number of checks the subject made for each of the
twelve categories (major term plus four additional descriptive terms) on all fifteen
musical recordings. This constituted his score for each music test category and
resulted in twelve “music feeling” scores for each subject. The recorded music used in
this study was:

 1. “Songs of a Wayfarer” Mahler (Columbia)
 2. “Sensemaya” Revueltas (Victor)
 3. “Gaite Parisienne” Offenbach (Victor)
 4. “Dance of the Seven Veils” Strauss (Victor)
 5. “Water Music Suite” Handel (Victor)
 6. “Buttons and Bows” Dinah Shore (Columbia)
 7. “You Can’t be True Dear” Griffen (Rondo)
 8. “Hair of Gold” Gordon MacRae (Capitol)
 9. “Beg your Pardon” Eddy Howard (Majestic)
10. “Now is the Hour” Margaret Whiting (Capitol)
11. “Golden Earrings” Jack Fina (MGM)
12. “My Happiness” Betty Peterson (Decca)
13. “Two Peasant Songs” Traditional (Columbia)
14. “Panihida” Tschesnokoff (Columbia)
15. “Calmly Flows the River” Traditional (Keynote)


¹This study was supported in part by a grant from the Coe Research Fund of the University of
Maine, and in part by research grant M-1118 from the National Institute of Mental Health,
National Institutes of Health, U. S. Public Health Service.

²Formerly at the University of Maine where the data were obtained.


Some subjects did not complete the music test so their papers were discarded.
Valid MMPI tests and completed music tests were available for 450 subjects, 299
men and 151 women. The data of this study are based on these 450 subjects and
consist of Pearson correlation coefficients between the twelve “music feeling” scores and
the nine MMPI scores (raw scores, not T scores).
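
The scoring rule in the Procedure (one point for every check on a category's major term or any of its four descriptive terms, summed over all recordings) and the Pearson coefficient can be sketched as follows. Only three of the twelve categories are shown, and the function names and example responses are hypothetical.

```python
from math import sqrt

# Three of the twelve categories from the check list, each with its four terms.
CHECK_LIST = {
    "Sorrow": ["dejected", "mournful", "despairing", "melancholy"],
    "Joy": ["boisterous", "humorous", "blissful", "happy"],
    "Calm": ["pastoral", "meditative", "serene", "tranquil"],
}


def music_feeling_scores(checks_per_recording):
    """Total the checks in each category across all recordings for one subject."""
    term_to_category = {}
    for category, terms in CHECK_LIST.items():
        term_to_category[category.lower()] = category  # the major term counts too
        for term in terms:
            term_to_category[term] = category
    scores = {category: 0 for category in CHECK_LIST}
    for checked_terms in checks_per_recording:
        for term in checked_terms:
            scores[term_to_category[term.lower()]] += 1
    return scores


def pearson_r(xs, ys):
    """Product-moment correlation between two equally long score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)


# Invented responses of one subject to three recordings.
print(music_feeling_scores([{"Sorrow", "dejected"}, {"happy"}, set()]))
```

In the study each of the twelve category totals would be paired with each raw MMPI scale score across subjects, separately for men and women, before applying the correlation.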


RESULTS 


The intercorrelations between the “music feeling” categories are given in
Table 1. The results presented in Table 2 for the men and women show a large
number of significant correlations between the raw scores on the MMPI and the “music
feeling” scores.

For the men none of the twelve “music feeling” scores are correlated with the
hypochondriasis scores of the MMPI. Depression, on the other hand, yields five
correlations significant at the .01 level of confidence for calm, love, eroticism,
jealousy, and cruelty. Hysteria produces a correlation significant at the .01 level for the
psychopathic deviate scores, with calm, yearning, and eroticism. Also the
masculinity-femininity scores yield one correlation significant at the .01 level for eroticism
and one at the .05 level for joy. Joy and assertion are related to the scores for
paranoia at the .01 level and jealousy and cruelty at the .05 level. Only wonder is related
to the psychasthenia scores at the .01 level, while sorrow and eroticism are related at
the .05 level. The schizophrenia scores correlate with joy and wonder at the .01 level.
The largest number of significant correlations are found between the “music feeling”
scores and the hypomania scores. These are significant at the .01 level for sorrow,
calm, cruelty and anger, and at the .05 level for joy, jealousy, wonder and solemnity.

These data reveal two interesting facts. In the significant correlations the trend
is in the positive direction between the MMPI scores and “music feeling” scores in all
cases with the exception of the correlation between depression and calm and the
correlation between schizophrenia and joy. While most of the significant correlations
are relatively low, a few of them are high. The correlations between psychopathic
deviancy and yearning and eroticism are .43 and .32 respectively. The relatively
high correlation between masculinity-femininity and eroticism of .37 should not be
overlooked. Schizophrenia produces the high correlation of .49 with wonder. The
correlation between depression and cruelty is .50.


TABLE 1. INTERCORRELATIONS BETWEEN THE “MUSIC FEELING” CATEGORIES

TABLE 2. SIGNIFICANT COEFFICIENTS OF CORRELATION BETWEEN MMPI SCORES AND “MUSIC
FEELING” SCORES FOR 299 MEN AND 151 WOMEN

*Significant at 5% level (r equal to or greater than .113 for men and .159 for women).
**Significant at 1% level (r equal to or greater than .148 for men and .208 for women).
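
The minimum significant r values quoted for Table 2 depend only on sample size and significance level. The sketch below shows one way such thresholds can be approximated; it uses a normal approximation to the t distribution, which is an assumption (the paper does not say how its thresholds were obtained), and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def critical_r(n_subjects, alpha):
    """Approximate two-tailed critical value of Pearson r for a given sample size.

    Uses r = z / sqrt(z^2 + df) with df = n - 2, where z is the normal deviate;
    with samples this large the normal deviate is close to the exact t value.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    df = n_subjects - 2
    return z / sqrt(z * z + df)


for label, n in (("men", 299), ("women", 151)):
    print(label, round(critical_r(n, 0.05), 3), round(critical_r(n, 0.01), 3))
```

Under this approximation the thresholds fall within a few thousandths of the values given for 299 men and 151 women; exact t-based values would differ slightly, especially for the smaller group.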

For the women we note in Table 2 fewer significant correlations than for the 
men. The men gave twenty-six significant correlations while the women gave only 
fifteen. 

As with the men, it can be seen in Table 2 that none of the twelve “music
feeling” scores are correlated with the hypochondriasis scores. In the case of the men,
the eroticism score was correlated with the masculinity-femininity scores, but there
are no significant correlations in the case of the women. Depression, on the other
hand, yields only one correlation significant at the .01 level in contrast with the five
that were obtained for the males. For the females this is between depression and
sorrow. Only one correlation significant at the .05 level between hysteria and wonder
is obtained. Psychopathic deviate scores yield two correlations significant at the .05
level, with yearning and love. Two significant correlations between paranoia and
sorrow at the .01 level and paranoia and jealousy at the .05 level have been obtained.
The schizophrenia scores correlate with sorrow and solemnity at the .01 level. It will
be recalled that for the men, schizophrenia scores correlated at the .01 level with joy
and wonder. As in the case of the men, the largest number of correlations for the
women are found between the “music feeling” scores and the hypomania scores.
These are significant at the .01 level for sorrow, wonder and jealousy, and at the .05
level for yearning, love, cruelty, and anger.

For the women we also find that the trend for the significant correlations is in 
the positive direction between the MMPI scores and “music feeling” scores, but the 
proportion of negative correlations is much higher than in the case of the 
men. Here, also, some of the correlations are high. While no exceptionally high cor- 
relations between the “music feeling” scores and hypomania scores were found for 
men, in the case of women there are two high correlations. They are .33 between 
hypomania and sorrow, and .36 between hypomania and wonder. 

Sakoda et al., in an article on a test of significance for a series of statistical 
tests, indicate the chance probability of obtaining at least n statistics significant at 
the .05 level and at the .01 level from N calculated statistics. Table 3 shows the 
number of significant correlations obtained for the men and women and the number 
which could have occurred by chance. It is not probable that obtaining 29 and 18 
significant correlations out of 108 calculated correlations was due to chance alone. 
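
The reasoning behind this comparison can be sketched with a binomial tail probability. The block below is an illustration under a loudly stated assumption, not a reproduction of the table by Sakoda et al.: it treats the 108 correlations (12 “music feeling” scores by 9 MMPI scores) as independent tests, which correlated scores do not strictly satisfy. The function name `prob_at_least` is hypothetical.

```python
from math import comb


def prob_at_least(n_sig, n_tests, alpha):
    """P(X >= n_sig) when X ~ Binomial(n_tests, alpha).

    This is the chance probability of seeing at least n_sig "significant"
    results among n_tests independent tests when every null hypothesis is true.
    """
    return sum(
        comb(n_tests, k) * alpha**k * (1 - alpha) ** (n_tests - k)
        for k in range(n_sig, n_tests + 1)
    )


N_TESTS = 108  # 12 "music feeling" scores x 9 MMPI scores

# Counts taken from Table 3: totals are significant at .05 or beyond,
# the second figure is the number significant at .01.
for sex, total_sig, sig_01 in (("men", 29, 17), ("women", 18, 7)):
    p_05 = prob_at_least(total_sig, N_TESTS, 0.05)
    p_01 = prob_at_least(sig_01, N_TESTS, 0.01)
    print(f"{sex}: P(>= {total_sig} at .05) = {p_05:.2e}, "
          f"P(>= {sig_01} at .01) = {p_01:.2e}")
```

Under the independence assumption, about 5.4 of 108 correlations would be expected to reach the .05 level by chance, so the observed totals lie far out in the tail for both sexes.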


TABLE 3. NUMBER OF SIGNIFICANT CORRELATIONS OBTAINED AND NUMBER 
WHICH COULD HAVE OCCURRED BY CHANCE 

Level of 
Significance      Men      Women      By Chance 

.01                17         7           5 
.05                12        11           9 

Total              29        18          14 





SUMMARY AND CONCLUSIONS 


Four hundred and fifty subjects were administered the group form of the MMPI. 
These tests were scored according to standard directions to yield nine personality 
scores. During a later laboratory period, the same subjects listened to fifteen selected 
recordings. They were asked to check as many descriptive terms as they felt could 
describe the music. The total number of checks made by each subject for all fifteen 
records for each of the twelve categories was counted and gave twelve “music feeling” 
scores for each subject. These scores were correlated with each of the nine person- 
ality scores separately for the sexes. The results permit the following conclusions: 

1. The technique of obtaining the “music feeling” scores seems to be a valid 
one. Further work needs to be done to verify this. The technique has a reliability 
of .76. 

2. Both the men and women were “normal” according to the MMPI. 

3. Significant correlations were obtained for both sexes between all of the 
MMPI scores and the various “music feeling” scores. 

4. For both sexes hypomania resulted in most correlations with the “music 
feeling” scores. 

5. Most of the obtained correlations were positive. Why tendencies toward the 
various abnormalities measured by the MMPI are related positively with the 
“music feeling” scores should be investigated in future studies. 

6. It is suggested that this experiment be repeated with other groups of 
“normals” and groups in the various “abnormal” categories. 
DRAWING OF THE HUMAN FIGURE 
PROBLEM 


The increasing use of human figure drawings and similar graphomotor tech- 
niques for diagnostic purposes raises important questions concerning the validity of 
the general and specific interpretations which are made from these drawings in the 
typical clinical setting. Figure drawing interpretations still have their basic roots in 
“a rather intuitive procedure based partially on general clinical experience and 
partially on a kind of figure drawing lore.” The present study was undertaken to 
investigate empirically the validity of certain typical figure drawing molar and 
molecular signs which are generally interpreted as measuring aggressive components 
(1, 2, 3, 4, 5). These signs are: 


1. Heavy line pressure 
2. Large figure size 
3. Slash-line mouth 
4. Detailed teeth 
5. Spiked fingers 
6. Clenched fists 
7. Nostril emphasis 
8. Squared shoulders 
9. Toes in a non-nude figure 


The specific hypotheses under investigation are: Experimentally induced feel- 
ings of aggression, interpolated between two figure drawing presentations, will elicit 
the following in the second drawing: (1) Greater line pressure, (2) greater figure size, 
(3) greater number of specific details generally interpreted as representing ag- 
gression, and (4) greater overall subjective judgments of “aggression”. 


PROCEDURE 
The subjects were 39 male and female attendants working in a state mental 
hospital. For the first of the two testing sessions all subjects were assembled in a 
large auditorium, seated at individual tables, and given the following material: 


1. A new No. 2 pencil, with a sharp point and the eraser removed. 


2. An interleaf pad of nine carbon and ten fine, white, 8½ x 11 sheets of 
paper. To make sure that this method of measuring line pressure would not be 
discovered by any subject, opaque top and bottom white sheets were used in 
lieu of the fine white sheets and strips of paper one-half inch wide and folded in 
half lengthwise were stapled around all four sides of the pad. 


All subjects were then asked to draw a figure with the standard, permissive, 
figure drawing instructions. Subsequent to this administration, the total group was 
divided into a control group and an experimental group utilizing the following match- 
ing procedure. Three independent judges took each subject’s pad and working from 
the back forward (to avoid preparatory set and figural after image) marked down 
the number of carbon sheets between the actual drawing and the first white sheet on 
which there was “any carbon line beyond a reasonable doubt, regardless of size and 
intensity.” The three judgments for each subject were then averaged, yielding a 
measure of line pressure termed Carbon Index (C. I.). Subjects were then ranked 
according to their C. I. and every other subject was then chosen for the experimental 
group. 
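
The matching procedure just described can be sketched as follows. This is a hedged illustration only: the subject labels and judge counts are invented for the example, and since the text does not say whether the first or the second subject of each ranked pair went to the experimental group, the sketch arbitrarily assigns the first. The function names `carbon_index` and `alternate_assignment` are hypothetical.

```python
from statistics import mean


def carbon_index(judgments):
    """Average the three judges' carbon-sheet counts into one C. I. value."""
    return mean(judgments)


def alternate_assignment(ratings):
    """Rank subjects by C. I. and send every other one to the experimental group.

    ratings maps a subject label to the three judges' counts. Assumption: the
    highest-ranked subject goes to the experimental group (not stated in the text).
    """
    ranked = sorted(ratings, key=lambda s: carbon_index(ratings[s]), reverse=True)
    experimental = ranked[0::2]
    control = ranked[1::2]
    return control, experimental


# Invented example data: five subjects, three judgments each.
example = {
    "S1": (3, 4, 3),
    "S2": (5, 5, 6),
    "S3": (2, 2, 3),
    "S4": (4, 4, 4),
    "S5": (6, 5, 6),
}
control, experimental = alternate_assignment(example)
print("control:", control)
print("experimental:", experimental)
```

Alternating down the ranked list keeps the two groups closely matched on initial line pressure, which is the variable the matching was meant to equate.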

The second testing session took place exactly one week after the first, with all 
subjects assembling in the same auditorium. At this point the members of the con- 
trol group, along with both examiners, then went to another large room. The physi- 
cal arrangement of both rooms was identical. The control group then proceeded to 
draw a figure with the same type of material and under the same instructions as the 
previous session. 

The examiners then returned to the experimental group, which had been kept 
waiting for twenty minutes, and commenced to give similar instructions for drawing 
a figure. At a prearranged signal, however, a supervisor of the School of Nursing 
entered and read the following fictitious announcement, which he claimed was from 
the State Director of the Budget: “Due to a shortage of ward personnel and in- 
sufficient job applicants, the work week for all hospital personnel will be increased 
from 44 to 48 hours. Salary payments will remain the same. We are sorry to have to 
increase the work week by four hours but present conditions make it unavoidable.” 
The multitude of spontaneous complaints at this point gives rather clear support to 
the investigators’, the supervisor’s and, after testing, the subjects’ judgments that 
quite intense feelings of aggression were induced by this combination of keeping the 
experimental subjects waiting and by the frustrating and unpleasant announcement. 
Immediately following the announcement all experimental subjects proceeded to 
draw a figure. 

The C. I. for the second test was then determined for all subjects. All C. I. 
judgments and all measurements discussed below were always made without the 
judge knowing to which group the subject belonged. Each drawing was blocked off 
into a rectangle so that its length and width could be measured. These measurements 
for each figure were then multiplied, yielding the area of the rectangle—the measure 
of figure size. The other seven purported signs of aggression were tallied for each 
subject and for each group, with doubtful judgments occurring very infrequently. 
In addition to these specific measures of aggression, an overall subjective impression 
was made by two independent judges, to provide a further method of evaluation. 
The two clinical psychologists compared the two drawings done by each subject and 
judged, without knowing in which session the drawing had been done, which of the 
two was more “aggressive.” They were instructed to use as criteria “those cues that 
you habitually employ in your daily practice.” 


RESULTS 


The number of subjects who either increased or decreased in Carbon Index, 
figure size or number of qualitative signs is presented in Table 1. 

As can be seen from these results, utilizing the Wilcoxon and the Chi-square 
tests for non-parametric distributions, statistical significance does not reach the 
usual .05 level of confidence for any of the several comparisons relating to line 


TABLE 1. FIGURE DRAWING CHANGES FOLLOWING EXPERIMENTALLY INDUCED 
AGGRESSION 

Drawing               Subjects      Subjects      Statistical 
Characteristic        Increasing    Decreasing    Test 

Line pressure: 
  C1 vs. C2              10             9         Wilcoxon 
  E1 vs. E2              10             5         Wilcoxon 
  C1 C2 vs. E1 E2                                 Chi-square 

Figure size: 
  C1 vs. C2                                       Wilcoxon 
  E1 vs. E2                                       Wilcoxon 
  C1 C2 vs. E1 E2                                 Chi-square 

Qualitative details: 
  C1 vs. C2                                       Sign 
  E1 vs. E2                                       Sign 
  C1 C2 vs. E1 E2                                 Chi-square 
pressure or figure size. However, when comparing qualitative details typically inter- 
preted as aggression indicators, using the Sign and Chi-square tests, the results are 
in striking contrast. For the control group, the number of aggressive details did 
not increase significantly from test one to test 
two. For the experimental group (those hearing the fictitious announcement), how- 
ever, there was a statistically significant increase in aggressive details from test one 
to test two. In addition, Chi-square analysis demonstrates a significant difference 
between the control and experimental groups. 
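
The logic of the Sign test used for the qualitative details can be sketched as below. Two cautions apply. The counts of subjects increasing and decreasing in qualitative details are not preserved in Table 1, so the example instead uses the line pressure counts that do survive (10 vs. 9 for the control group, 10 vs. 5 for the experimental group). The study analyzed line pressure with the Wilcoxon test, which also uses the size of each change, so the figures printed here are illustrative and are not the study's reported statistics. The function name `sign_test_two_sided` is hypothetical.

```python
from math import comb


def sign_test_two_sided(n_increase, n_decrease):
    """Two-sided exact sign test p-value; subjects showing no change are ignored.

    Under the null hypothesis an increase and a decrease are equally likely,
    so the count of increases follows Binomial(n, 0.5).
    """
    n = n_increase + n_decrease
    k = max(n_increase, n_decrease)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2**n
    return min(1.0, 2 * upper_tail)


# Line pressure counts from Table 1 (used here for illustration only).
print("control      (10 up, 9 down):", sign_test_two_sided(10, 9))
print("experimental (10 up, 5 down):", round(sign_test_two_sided(10, 5), 3))
```

Both values fall well above .05, which agrees in direction with the non-significant line pressure findings reported above.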


Subjective judgments of aggression by the two judges, Table 2, did not signifi- 
cantly differentiate the control and experimental groups; however, there was sig- 
nificant agreement between judges. 


TABLE 2. SUBJECTIVE JUDGMENT OF AGGRESSION IN PAIRED DRAWINGS 

Compared          Judge A               Judge B 
Group             Test 1    Test 2      Test 1    Test 2 

C1 vs. C2           12 
E1 vs. E2 

Chi-square 








DISCUSSION 

Line pressure and figure size do not hold up as valid interpretive signs of ag- 
gression. In the case of specific drawing details when considered together, however, 
the aggressive component clearly shows through. It would appear, therefore, that 
attention to specific drawing details provides a more fruitful means of tapping ag- 
gressive feelings than does attention to line pressure and figure size. Unexpected 
support for this implication is provided by the introspections of the two psychologists 
making the overall subjective judgments which, it is to be noted, were statistically 
non-significant. They both reported primary reliance upon line pressure and figure 
size as criteria for judgment of aggression and only secondary consideration of spec- 
ific details. Their lack of success would appear to be clearly consistent with the 
findings reported above. 

A dichotomy appears to have been formed here, with non-significant results be- 
ing found in those areas measuring aggression in the form of graphomotor tension 
and with significant results being found in those areas in which aggression is ex- 
pressed symbolically. 


SUMMARY 


The effects of experimentally induced feelings of aggression interpolated be- 
tween two figure drawing presentations were studied in terms of four variables: line 
pressure, figure size, a group of seven specific drawing details, and overall subjective 
judgments of aggression. Of these four variables hypothesized to be related to ag- 
gression, only the seven specific drawing details, as a group, did in fact relate to ag- 
gression. A rationale for these results was proposed, viz., a dichotomy between 
graphomotor and symbolic expression of aggression. 
STIMULUS-DETERMINANTS OF “SHADING” RESPONSES 
WILLIAM ECKHARDT 


State Hospital, Morganton, N. C. 


PROBLEM 


Rorschach introduced the “chiaroscuro” score to account for the simultaneous 
use of light-dark aspects of the ink blots. Binder added “soft fur” and 
“gruesome” scores to account for the use of light and dark aspects separately. 
Klopfer introduced further refinements, but according to Beck, “the light reaction 
is the one we know least about in the Rorschach test.” In order to clarify 
the determinants of “shading” responses, an experiment was designed on the basis 
of the following definitions and hypotheses: 


Overall intensity was defined as the degree of brightness from white to black. 
It was expected that this stimulus-variable would determine “achromatic color” 
responses and also “gruesome” responses when the degree of brightness was 
black or dark gray. 


Brightness contrast was defined as an abrupt change from one degree of 
brightness to another. It was expected that this stimulus-pattern would deter- 
mine “transparency” and “toned-down depth” responses. 


Brightness gradients were defined as gradual changes between different 
degrees of brightness. It was expected that this stimulus-pattern would de- 
termine “texture”, “diffuse vista” and “diffuse movement” responses. This 
latter hypothesis was partly based on a suggestion made by Klein and Arn- 
heim: “Gradients of shading make for movement.” 


METHOD AND PROCEDURE 


Ten popular details were selected from the first seven Rorschach cards. All three 
brightness determinants were present in these details: gradients, contrast and in- 
tensity. These details were reproduced photographically in their original positions 
on cards similar in size to the Rorschach cards. In order to eliminate the brightness 
gradients, the same ten details were again reproduced in a two-toned pattern of 
brightness. Finally, in order to eliminate both gradients and contrast, the same ten 
details were reproduced in a monotone. This treatment provided three sets of ten 
experimental blots each, identical in form, but varying in the brightness variable. 

These experimental blots were presented to twenty subjects in the following 
order, using Beck’s numbers: Card I-D4; Card II-D1; Card III-D7, D9; 
Card IV-W except D1, D5; Card V-D4; Card VI-D4, D8; Card VII-D4. Each sub- 
ject was first shown the gradients, then the contrasts, and finally the monotones, 
with these instructions: “Here are some cards at which I should like you to look. 
Just tell me what they look like to you or of what they remind you.” Only one res- 
ponse per card was required, following which an inquiry was made: “What about the 
card makes it look like that?” Scoring was accomplished in accordance with the 
definitions given in the preceding section, using Klopfer’s symbols: “C′” for in- 
tensity, “k” for contrast, and “c” for gradients. 

Most of the subjects were college students, both graduate and undergraduate, 
but some of them were high school students. Ages ranged from sixteen to sixty-five 
years. Both sexes were equally represented. None of the subjects was aware of the 
purpose of the experiment, nor familiar with the Rorschach test. All of the subjects 
were of at least average intelligence, and suffering from no obvious emotional dis- 
order. 







RESULTS 


Form, achromatic color, orthodox movement and vista, toned-down depth, 
transparency, texture, diffuse movement and vista responses were made to the ex- 
perimental blots. Form, achromatic color, and orthodox movement and vista res- 
ponses occurred with equal frequency under all three experimental conditions. Any 
differences were not significant, at the five percent level, as measured by the chi- 
square test of significance of change. All probability figures given in this 
section are based on this test. 

Sixty-six texture responses (largely “furs”, some “rocks”), eighteen diffuse move- 
ment responses and sixteen diffuse vista responses were made to the brightness 
gradients only (p < .01), demonstrating that these responses were determined by 
these gradients. 

Sixteen transparency responses and fourteen toned-down depth responses were 
made to the brightness contrasts. A smaller, but not significantly different, number 
of these responses were made to the brightness gradients, but none at all were made 
to the monotones (p < .01). Since contrast was the only brightness variable in com- 
mon to both gradient and contrast blots, it was reasonable to infer that these res- 
ponses were determined by brightness contrast. 

Although achromatic color responses occurred with equal frequency to all 
three sets of experimental blots, it was inferred that these responses were determined 
by overall intensity since this was the only brightness variable which all three sets 
had in common. 

In addition to these formal differences, physiognomic differences were also 
significant (p < .05). Brightness gradients were perceived as “life-like”, whereas 
monotones and contrasts were perceived as “dead”, suggesting that “gruesome” 
responses were determined by the darker degrees of overall intensity. A few proto- 
cols illustrate these differences: 


“The shaded cards have more life to them. The plain blots look dead and dormant.” 
“The plain cards are flat and dead. Shading gives expression, movement, life.” 
“Shaded cards suggest reality.” 
“Plain cards are like shadows.” 
“In the plain cards, the blackness is more threatening.” 


SUMMARY 


An experiment was designed to clarify the determinants of so-called “shading” 
responses. Texture, diffuse movement and diffuse vista responses were determined 
by brightness gradients, defined as gradual transitions between various degrees of 
brightness. Transparency and toned-down depth responses were determined by 
brightness contrasts, defined as abrupt transitions between different degrees of bright- 
ness. Achromatic color responses were determined by overall intensity and gruesome 
responses were determined by the darker degrees of overall intensity. 
THE SENTENCE COMPOSITION TEST 
WILLIAM W. MICHAUX 
Veterans Benefits Office, Washington, D. C. 


DESCRIPTION 


The Sentence Composition Test is a projective technique in which the subject 
is asked to make up and write out 20 sentences, each containing the word “because”. 
Test interpretation is accomplished by a qualitative psychodiagnostic approach 
similar to that commonly used with sentence completion tests. 

Materials consist of a pencil and two sheets of ordinary blank, white bond paper, 
size 8 x 10½ in. or 8½ x 11 in. The two sheets are stapled together at the upper left- 
hand corner. The examiner may prefer to provide a pen instead of a pencil, thereby 
making it more difficult for the subject to eliminate any “unintentional” portions of 
what he writes, these portions often being subject to useful interpretation. At the 
top of the first sheet of paper, the following instructions are mimeographed, typed, or 
imprinted with a specially prepared rubber stamp: “Make up, and write out below, 
20 sentences, using the word ‘because’ somewhere in each sentence. Number the 
sentences from 1 to 20, so as to keep count of them. Use this page and the blank 
page which comes after it.” The examiner may read these instructions aloud while 
the subject follows the printed wording; or the subject may simply be handed the 
materials and told that the directions are self-explanatory, and that he may read 
them for himself and then begin. The subject should also be told that he may take 
whatever time he needs to finish. Testing time is usually about 30 minutes, rarely 
exceeding 45 except with severely inhibited and seriously disturbed subjects. Before 
dismissing the subject, the examiner should read the protocol and make certain all 
words are legible. 


SAMPLE PROTOCOL 


The following protocol was given by a 30-year-old single white male veteran, 
unemployed, of average intelligence. It was obvious on even the briefest personal 
contact that he was an ambulatory psychotic. At the time of testing, he was being 
considered for outpatient psychotherapy. The Rorschach and other tests indicated 
a severe disruption of reality ties, and withdrawal into autistic fantasy and delusional 
thinking. His illness had been repeatedly diagnosed as paranoid schizophrenia. His 
Sentence Composition Test was as follows: 


1. Health is important to me because it is happiness, and without happiness there is no life. 
2. To be a success is not enough because love is to the soul as water is to the plant. 


3. If I marry someone who really loves me, it will be a paramount issue in my life because 
I am very lonely. 


4. Sex is important to me because of my inner drive, and has a vital effect on my personality. 


5. I know I can overcome my illness, because of my faith in God, who understands what I 
have been up against in the past and at present. 


6. I cannot understand many things, because people don’t understand me. 


7. I know I will overcome whatever is wrong with me, because where there’s a will there’s 
a way, and I trust the medics. 

8. To lose my sexual potency would be to lose my life, because my mind is set against any 
sexual irregularity. 


9. I am not sure if I know what love really is, because I’ve been tricked so much by strang- 
ers. 


10. I know if I could get a decent job I’d be OK, because I’d be happy. 


11. I am not glad I am out of the service, because to get a medical discharge is an awful 
blow to me. 


12. I don’t like the guys in Washington because most of them are queers. 


13. To be given a decent break would be my cure, as my neurosis is aggravated by vicious- 
ness. 
14. There’s something wrong with my left-hand side because I keep getting bad pains there. 
The VA (Veterans Administration) says no. 


15. I don’t get dejected much any more, because I am acquiring the ability to pull out of 
moods. 


16. I need medicine because of the tension in my face. 

17. I get aggravated because the doctors tell me I imagine it and medicine cures it. 
18. I would never go to a hospital because I know it would make me worse. 

19. I feel an injustice at times because psychiatrists generalize too much. 

20. When a medicine cures, an illness must be real, because a virus isn’t from nerves. 


This record exemplifies starkly how the Sentence Composition Test can yield 
personalized material which lends convincing substance to the inferences from other 
projective tests that otherwise would be more tentative and non-specific. Taking the 
record first grossly and globally, one senses in the patient’s sweeping generalities the 
unrealistic but vitally desperate hopefulness of his struggle for protection against 
trauma and for escape from his psychotic predicament. Quite candidly, the 
patient depicts himself as having recoiled in bewilderment to an egocentric position 
where self-preservation has become an all-consuming enterprise. He is pitted in 
solitary estrangement against an unfriendly, undependable world. In this exigency, 
he must devote all his resources to the effort to win acceptance, understanding, and 
love. 

A more analytical study of his record would educe, for example, the theme of 
personal omnipotence and magic as the only forces he can be sure are benevolent 
(sentences nos. 1, 2, 5, 7, 8, 15). A second prominent phase of his striving is ambi- 
valence, involving not only doctors (no. 7 as against nos. 11, 17, 18, 19), but virtually 
everybody (no. 3 as against nos. 6, 9, 12). Sentences 4, 8, and 12 suggest how sexual- 
ity has become involved in his illness. Hypochondria, somatization, and possibly 
somatic delusions are evident as modes of reaction to threat and trauma (nos. 1, 8, 
14, 16, 17, 18). He clings to platitudes like a drowning man clutching at straws 
(nos. 1, 2, 5, 7, 10). In the finished psychodiagnostic study of this patient, such ele- 
ments as these would be reintegrated into a structured picture of the person, to 
which other projective tests would make essential contributions. 


SUMMARY AND CONCLUSIONS 


In the Sentence Composition Test, the subject is asked to write 20 sentences, 
each containing the word “because.” Protocols can be qualitatively analyzed by 
the use of established psychodiagnostic principles. The Sentence Composition Test 
has one distinctive feature which justifies its appearance among the perennial burgeon- 
ings of new projective techniques. It taps the same free-associative function as does 
psychotherapy, with minimal intrusion of stimuli which exert their own specific 
influences on responses (cf. stems in the sentence completion tests). The use of the 
word “because” helps to prevent repetitiousness and stereotypy, and makes for con- 
ciseness and personalized meaningfulness in the subject’s responses. Unlike many 
projective tests, the Sentence Composition Test can readily be administered to 
groups as well as to individuals. In addition, it yields a specimen of handwriting 
which can be studied psychodiagnostically. 





STRUCTURAL PROPERTIES OF BENDER-GESTALT 
TEST ASSOCIATIONS¹ 


ALEXANDER TOLOR 
USAF Hospital, Parks Air Force Base, California 


PROBLEM 

Although there is little objective information available on this particular aspect 
of its use, associations to the Bender-Gestalt Test designs are frequently used by 
clinicians as a projective device. Suczek and Klopfer have attempted to generalize 
on the basis of associations offered by a group of normal subjects by assigning tenta- 
tive meanings to these responses and suggesting relationships between each of the 
Bender designs and particular aspects of the personality. The present investigation 
focuses attention on the formal or structural properties of the Bender associations 
rather than on their symbolic significances. This type of analysis should furnish in- 
formation that might be helpful to the clinician who endeavors to draw inferences 
from the specific associations given by his patients. 


SUBJECTS AND METHOD 


A total of 50 neuropsychiatric patients hospitalized at Parks Air Force Base 
Hospital participated in the present study. Among this group there were 45 male and 
5 female patients. The average age of this sample was 26.8, ranging from 17 to 43 
years. In intelligence the patients ranged from an IQ of 77 to an IQ of 129, the mean 
WAIS quotient being 102.3. The following discharge diagnoses were assigned to the 
patients following extensive diagnostic evaluation: Character and behavior dis- 
orders, 18; psychoneurotic disorders, 5; psychotic (schizophrenic) disorders, 9; 
organic brain injury, 15; and no psychiatric illness, 3. 

During the course of individual testing, and following the Rorschach adminis- 
tration, every patient was requested to copy the Bender-Gestalt Test designs ac- 
cording to standard procedures. Immediately thereafter there was a recall phase 
during which the patient was requested to draw as many figures from memory as he 
could. Next followed a free association phase at which time the patient was shown 
each of the original configurations in the customary order and was instructed: 


“I would like you to tell me what each of these could stand for just as you had 
told me what each of the inkblots could represent. Use your imagination and 
please give me the first idea that comes to mind. Remember, I want to know 
what these designs could be.” 


On the basis of the first 20 patients’ responses to these instructions, a number of 
tentative scoring categories were at first set up for each of the designs. It was recog- 
nized that all degrees of refinement are possible in devising a classification system. 
After making many revisions which resulted in some categories being combined and 
others being discarded entirely, a fairly broad, meaningful, and objective descriptive 
system was finally decided upon. The various categories are presented below along 
with their respective meanings: 

1. Rejection. Patient indicates that he is unable to interpret the design. 

2. Non-specific response. An association which could apply equally well to any or all of the de- 
signs. (Examples: just a drawing, geometric figure, pattern or design). 

3. Descriptive response. The design is described, often in minute detail, rather than interpreted. 
(Examples: dots, periods, circle, diamond, references to spacing, curvature, etc.) 

4. Letter of alphabet. The design or part of the design is taken to represent a letter of the alphabet. 


5. Interpretation that takes into account only part of the design or one which does not show a 
meaningful relationship between parts. 
6. Interpretation in which parts are integrated into meaningful whole. 


¹The opinions and conclusions expressed do not necessarily represent those of the Department of 
the Air Force. 
Categories 5 and 6 were further subdivided as shown below: 
A. Concept entirely inappropriate for form of design. 
B. Lower level of abstraction: some attempt is made to interpret but there is little devia- 
tion from line concept or the concept is a forced, arbitrary one. (Examples: stars, 12 ants in 
a line). 
C. Higher level of abstraction. 
D. Movement projected into the figure. 
a. Free, unimpeded movement. 
b. Movement that is blocked or state of balance made explicit. 


RESULTS 


After the various categories had been decided upon, the Bender associations of 
the 50 patients were scored using this system*. The resulting frequency distribution 
is presented in Table 1. When the first four categories, which represent the most 
severe associational incapacitation, are combined, one notes that the frequency of 
this type of response departs significantly from an expected equal occurrence among 
the nine Bender designs. The Chi-square is 17.01 which for 8 degrees of freedom is 
significant at the .05 level. When a similar analysis is performed for the 5-level 
categories, which represent a lesser degree of associational disturbance, the distri- 
bution of responses again is found to vary significantly for the nine Bender figures 
(Chi-square 25.2, P < .01). A comparison at the highest level of abstraction and 
conceptualization, the 6-level with class 6A omitted, also yields significant differ- 
ences for the various Bender designs (Chi-square 24.1, P < .01). 
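
The significance levels attached to these three Chi-square values can be checked with the closed-form upper-tail probability that exists when the degrees of freedom are even. This is a small verification sketch, not part of the original analysis. The text states 8 degrees of freedom for the first comparison; the sketch assumes the same 8 degrees of freedom for the other two, since they also compare the nine designs. The function name `chi2_sf_even_df` is hypothetical.

```python
from math import exp, factorial


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(Chi-square > x) for an even number of df.

    For df = 2m the tail has the closed form
    exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
    """
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2
    return exp(-half) * sum(half**k / factorial(k) for k in range(df // 2))


# Chi-square values reported in the text; df = 8 is assumed for the last two.
for label, value in (("categories 1-4", 17.01),
                     ("5-level", 25.2),
                     ("6-level, 6A omitted", 24.1)):
    print(f"{label}: Chi-square = {value}, p = {chi2_sf_even_df(value, 8):.4f}")
```

The first value falls between the .05 and .01 levels and the other two fall below .01, consistent with the levels reported.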
DISCUSSION 

Figure A appears to be the most difficult in terms of its associational stimulus 
value. Most frequently descriptions rather than interpretations are made in response 
to this design. Non-specific responses are not uncommon. Only about 20% of the 
patients are able to offer a well-integrated whole response of which 6% are of the 
kind implying tension (impeded movement or state of precarious balance). 

Figure 1 is notable for the frequency with which it elicits descriptive responses 
and the absence of any rejections. It would seem that it is fairly easy to respond to, 
although a large number of patients can do so only on very primitive 
levels. 


*It should be noted that in those instances where a subject offered more than one response to a 
specific design, that association which reflected the higher level of abstraction was selected. This was 
done in order to make allowance for spontaneous improvements. 
Figure 2 was found easier by these patients than Figure 1 since half of them
were capable of responding with a well-perceived, higher-level whole association. 
There were no interpretations that took into account only part of this design. 

Figure 3 appears to be the easiest of the nine designs since 74% of the patients
offered meaningful whole responses to it. Twenty-four percent of these involved
free, uninhibited movement.


Figure 4 is the second most difficult of the nine designs because of the frequency 
of rejections and the very few well-integrated whole responses. Descriptive responses 
are common and Figure 4 is the design in response to which letters of the alphabet 
were given most frequently. 


Figure 5 allows for a large number of higher level associations. Descriptive 
responses are rather rare. 


Figure 6 is another easy design. Although the majority of the responses re- 
flected a higher level of abstraction, a considerable proportion were whole responses 
in which there was little deviation from a simple line concept. Movement was never 
read into this design. Responses to only part of the figure occurred frequently. 




Figure 7 is often rejected and often interpreted in a non-specific way. Those 
patients who were able to integrate the parts into a meaningful whole also always 
achieved a high level of abstraction on this design. 

Figure 8 differs little from Figure 7 in level of difficulty. The slightly fewer re- 
jections found here are negated by the more frequent descriptions. Movement was 
never projected into this design. 


These findings suggest that the Bender figures vary considerably in their
stimulus value. Some of the figures appear to be so difficult for psychologically disturbed
patients that many rejections, descriptions, and vague, non-specific associations are
offered. The marked differences which have been observed in this experiment could
be related to the differences in the structure of the designs or could be a function of
the different emotional responses produced by the nine designs. In any case, it would
seem appropriate for the clinician to take these differences into consideration in
interpreting the symbolic significance of the Bender associations.


SUMMARY 


Fifty neuropsychiatric patients were requested to associate to each of the nine
Bender-Gestalt Test designs. The various types of responses offered were classified
in an attempt to explore the formal or structural properties of these associations.
Significant differences were obtained in the frequency with which each type of
response was elicited by the nine designs. Discussion centered about the need to
consider these structural differences in the symbolic interpretation of Bender
associations.
REPEATED TESTING OF FOUR CHRONIC SCHIZOPHRENICS ON THE
BENDER-GESTALT AND WECHSLER BLOCK DESIGN TESTS* 


J. D. KEEHN 


American University of Beirut, Lebanon 


INTRODUCTION 


Although in some psychological tests claims are made for no more than con- 
current validity, in the majority of cases construct validity in the sense used by 
Cronbach and Meehl is taken for granted. For diagnostic purposes, pure and
simple, the former is often adequate but for the more general case of discovering 
ways in which groups of individuals differ from each other construct validity is 
obligatory. Within this general context the effects of learning and of temporary, 
occasional factors are important. Thus if, for instance, a particular test of ability X 
differentiates between psychotics and normals we would like to know whether lack of 
X is central to psychosis or whether this lack could be made good by learning, the 
essential psychosis remaining unaffected. Alternatively, X may not be lacking at all 
but its apparent absence be a function of fatigue, low motivation, inability to con- 
centrate, etc. In either case the construct validity of the test is at stake. 

One way to tackle this problem would be to administer repeatedly the same test
or tests to a group of individuals and note variations in their scores. Apart from the
dozens of reports on simple test-retest studies, a few isolated studies of this kind are
to be found in the literature. However, most of them have been related to test score
changes dependent upon specific external factors like therapy or maturation. Thus
Pascal and Suttell report Bender-Gestalt score changes in patients undergoing
ECT, and Rosenthal and Imber describe changes in the same test with patients
taking mephenesin. A similar, limited study, using Wechsler-Bellevue changes, is
described by Mailer. In addition to the longitudinal studies with intelligence tests,
Allen has reported two studies of repeated Rorschach administrations to a
normal child up to five years of age.

These studies, however, all attempt to relate score changes to actual changes 
within the individual, as does the report of Pascal and Suttell on an untreated
patient. Bender reproduces five successive Bender-Gestalt protocols of an aphasic
patient showing remarkable improvement on the test over an eight-day period, but
does not indicate other changes in the patient’s behavior. Our purpose in this study is 
to examine test score changes when there are no apparent major external or internal 
variations in the patients’ conditions. To this end the Bender-Gestalt and Wechsler 
block designs (Form II) were given repeatedly to four chronic schizophrenics of long 
hospital standing with only custodial care. 


PROCEDURE 


The four patients had been hospitalized for periods of 17 years, 5 years, 3 years 
and 5 years respectively and all had shown stable psychotic symptoms for at least 
the past three years. Patient A was a 56 year old male schizophrenic suffering from 
grandiose delusions of at least 23 years standing. Patient B was a 31 year old female 
schizophrenic who had been hospitalized for the past 5 years with symptoms of dis- 
orientation and lack of affect with periodic aggressive outbursts. Patients C (female, 
age 41) and D (male, age 34) were both paranoid schizophrenics whose delusions be- 
gan about 6 and 8 years ago respectively. 

These patients were tested from 13 to 15 times over a period of about 2 months 
with the Bender-Gestalt and Wechsler-Bellevue (Form II) block designs using 


*This study was made possible by a grant from the Rockefeller Brothers Fund to the American 
University of Beirut. The author wishes to acknowledge his indebtedness to the fund and to Dr. W. 
M. Ford Robertson, O. B. E., Medical Director, Lebanon Hospital for Mental and Nervous Dis- 
orders, for allowing patients under his care to be tested. 
standard instructions, procedures and scoring given in Pascal and Suttell and the
weighted scores from the appropriate table in the Wechsler Manual, respectively.
All told, testing occurred at about 4 day intervals although there were some omissions
due to administrative difficulties, sickness and holidays. Patients A and D, the
males, were interested and cooperative throughout but patient C was often
reluctant to do the tests and refused on two occasions. The remaining female, patient
B, was indifferent throughout but was too excited to be tested on two occasions. To
prevent contamination of the results the tests were administered by one person and
scored by another, both of whom knew nothing of the purpose of the experiment.¹


RESULTS 


Although in a limited study a detailed discussion of the findings would be
premature, the important data are summarized in Table 1 where the means of the first
four and last four trials for both tests are shown together with the patients’ best and 
worst efforts. The data may be treated in terms of improvement with practice and 
variability of performance. 


Effects of practice. No systematic trends in Bender-Gestalt scores emerged over 
all the patients. This applied both to the total score and to the scores on the in- 
dividual designs. To see if the patients were generally better or not at the end of the 
series, means of the first four and last four scores on each design were calculated. 
These are listed for the four patients separately under the columns M1 and M2 in
Table 1. In the majority of cases the difference between these means is less than 4 
points, the only substantial change being for the worse by patient C on design 1. On 
the total score patients A and C showed a slight worsening but patients B and D 
gained averages of 12.25 and 14.5 points respectively. The final averages of these 
latter two patients correspond to Z scores of 60 and 52 respectively on the Pascal 
and Suttell norms for high school subjects. Were these norms applicable to our
subjects, these scores would indicate some doubt about their need for psychiatric
help; yet both are chronic schizophrenics of long standing. The remaining two
patients even after about 15 attempts still obtained error scores well above those of 
any of Pascal and Suttell’s normal group. From our limited data we may conclude 
that some patients can improve their performance on the Bender-Gestalt designs 
with practice but that such improvement is erratic and probably limited to patients 
who perform tolerably well initially. In both patients who improved, the evidence
suggested improvement on all designs rather than any particular one, although there
was some suggestion that greater improvement occurs on design 4.²
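The comparison of first-four and last-four means described above is simple to reproduce. The following is a minimal sketch under stated assumptions: the function name `first_last_means` and the score series are hypothetical and are not the study's data. Bender-Gestalt scores are treated as error scores, so a positive gain means fewer errors at the end of the series.

```python
from statistics import mean

def first_last_means(scores, k=4):
    """Return (mean of the first k scores, mean of the last k scores)."""
    if len(scores) < 2 * k:
        raise ValueError("need at least 2*k scores so the two blocks do not overlap")
    return mean(scores[:k]), mean(scores[-k:])

# HYPOTHETICAL total error scores for one patient over 13 sessions
# (lower is better). These are NOT the study's data.
hypothetical_totals = [40, 36, 38, 42, 30, 35, 28, 33, 25, 29, 27, 24, 26]

m_first, m_last = first_last_means(hypothetical_totals)
gain = m_first - m_last  # positive gain = fewer errors at the end of the series
print(m_first, m_last, gain)
```

Averaging over four trials at each end, rather than comparing single sessions, damps the session-to-session fluctuations that the study goes on to discuss.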

On the Wechsler block designs all four patients performed better at the end of 
the trials than at the beginning. These improvements again were irregular and quite 
small when converted into weighted scores. The most important finding with this 
test is that although there are definite improvements in the accuracy score where 
this is possible, there is no evidence for any improvement in the time credits. This 
finding is in accord with Shapiro and Nelson’s observation that schizophrenics
tend to be slower than normals in general. Our findings suggest that this deficit is
not overcome with practice. However, it must be emphasized that no attempt was
made to instruct the patients how to improve their scores on the tests. 


Variability. In addition to the overall effects of practice it is pertinent to ex- 
amine the fluctuations in performance. Marked fluctuations did occur in all four 
patients’ performances. These fluctuations were greatest on the Bender-Gestalt 
total scores and compare with those found by Pascal and Suttell for ECT-treated
patients. However, the present fluctuations occurred without treatment or apparent
changes in the patients’ overall behavior. The best and worst performances of each 


¹Miss A. Papamichael and Miss A. Sabbagh, whose assistance is acknowledged together with that
of Mr. B. al-Farr, who administered the tests on a few occasions.

²Detailed charts showing scores obtained by each patient on each testing occasion may be
obtained from the author.
patient on the separate Bender-Gestalt designs and the block designs are shown in
Table 1. The Bender-Gestalt total scores differ greatly in all cases. Even the best 
scores of 68 and 75 for patients A and C correspond to Pascal and Suttell high 
school Z scores of 103 and 110 respectively and are well within the abnormal range. 
However the corresponding Z scores for the raw scores 18 and 16 of patients B and D 
respectively are 50 and 48 and these are within the normal range for these norms. 
These best scores represent the best performances on the whole test by the patients, 
i.e., not the sum of the best performances on the individual designs. If we consider 
the best performance on each design irrespective of the occasion on which it was
obtained then the sums of the “best” columns in Table 1 would represent the maximum
possible performances for the patients. In this way patients A, B, C, and D could 
obtain raw scores of 51, 5, 35, and 10 which are Z scores of 85, 36, 68 and 41 respect- 
ively on the norms which we have been using. In this case only patient A scores 
definitely in the abnormal range with B and D scoring better than most of Pascal 
and Suttell’s normals. These results must, of course, be treated with considerable 
caution for we do not know how normals would react to repeated testing or if the 
norms apply to our patients. 

What is probably more important in this connection is that in every case the
patients’ best possible performance substantially exceeded their best actual
performances. This would suggest that the patients’ difficulty lies not so much in the
Bender-Gestalt material itself as in their inability to apply themselves
consistently to the material over the whole testing period. Similar results
obtained for the block designs although to a lesser degree. In this case the general
improvement of all four patients was more marked than their score fluctuations,
especially in terms of weighted scores.
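The distinction between a patient's best actual and best possible performance can be made concrete with a small sketch. It is an illustration only: the function names and the score matrix are hypothetical and are not the study's data. Scores are treated as error scores, so "best" means lowest.

```python
# HYPOTHETICAL error scores (lower is better): one row per testing occasion,
# one column per Bender-Gestalt design. These are NOT the study's data.
hypothetical_sessions = [
    [4, 8, 6, 10, 7, 5, 9, 12, 7],
    [6, 5, 9, 8, 4, 7, 11, 9, 8],
    [5, 7, 4, 12, 6, 3, 8, 10, 11],
]

def best_actual_total(sessions):
    """Lowest total error score achieved on any single occasion."""
    return min(sum(session) for session in sessions)

def best_possible_total(sessions):
    """Sum of the lowest score on each design, regardless of occasion."""
    return sum(min(design_scores) for design_scores in zip(*sessions))

actual = best_actual_total(hypothetical_sessions)
possible = best_possible_total(hypothetical_sessions)
print(actual, possible)
```

The best possible total can never be worse than the best actual total, so what is informative is the size of the gap. A large gap means that good performances on different designs were scattered across occasions rather than concentrated in one sitting.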


SUMMARY 


Four chronic schizophrenic patients were tested repeatedly with the Bender- 
Gestalt and Wechsler-Bellevue block design tests between 13 and 15 times at inter- 
vals of about 4 days. The two better scoring patients on the Bender-Gestalt im- 
proved after their initial performances; the remaining patients scored a little worse. 
All four patients improved on the block design but mainly through improved ac- 
curacy. 

Bender-Gestalt scores fluctuated greatly over the trials and the evidence sug- 
gested that better scores could have been obtained if the patients had been tested in 
several short periods. Thus for all four patients the maximum possible score obtained 
by adding their best scores on the separate designs, irrespective of the occasion on 
which it was obtained, exceeded by several points the actual best performance. 
Variations also occurred on the block designs but seemed less important than the 
overall improvement of the patients on this test. 
RECALL OF THE BENDER-GESTALT DESIGNS BY ORGANIC AND
SCHIZOPHRENIC PATIENTS: A COMPARATIVE STUDY¹


MARVIN REZNIKOFF AND TOM D. OLIN 
The Institute of Living, Hartford, Conn. 


PROBLEM 


As part of a recent study, Tolor compared the ability of three groups of
patients to recall the Bender-Gestalt designs. After making an adjustment for the 
effects of intelligence, he found significant differences in recall between organic 
patients without convulsive disorders, organics with convulsive disorders and 
patients with primarily non-psychotic psychogenic disorders. Both the patients 
with psychogenic disorders and the convulsives were superior to the organic, non- 
convulsive group in memory for the designs. The purpose of the present study was 
to determine whether patients with severe psychogenic disorders, that is, schizo- 
phrenia, could be differentiated from a group of organic patients in their ability to 
recall the Bender-Gestalt designs. In addition, this research served to check some of 
Tolor’s findings with an independent group of organic patients. 


PROCEDURE 


Subjects. The subjects included in this study were 33 patients having diagnoses of 
organic brain disease, and 50 patients with diagnoses of schizophrenic reaction. The 
organic group contained 19 patients with non-convulsive organic diagnoses and 14 
patients with convulsive diagnoses. A variety of subtypes were included in the 
schizophrenic group. No organic patient was diagnosed as having a functional in- 
volvement, although in two cases the question of psychoneurosis was raised, and 
none of the schizophrenics were found to have an organic involvement. The organics 
came from the same population used by Tolor at the Neurological Institute, Colum- 
bia-Presbyterian Medical Center, New York. All of the schizophrenics were patients 
at the Institute of Living, Hartford, Connecticut. 

In order to obtain a certain degree of homogeneity in the groups, only those
patients were included in the study who were between 15 and 50 years of age and
had an IQ of 80 or better. Table 1 presents the composition of the groups with respect
to age and IQ (t tests were performed to ascertain whether the disparities reached
significance). While the differences in ages between the non-convulsives and
convulsives, as well as between all the organics and the schizophrenics, are significant, the
correlations between age and recall indicate that age is not significantly related to
the ability of patients to recall the Bender-Gestalt designs within the age range
included in the present study (18-49). The r was -.07 for the total organic group and
-.04 for the schizophrenic group, when a stringent measure of recall was correlated
with age. The difference between the groups with regard to IQ was found to be
insignificant.


TABLE 1. COMPOSITION OF GROUPS

Factors         Non-convulsive   Convulsive   Total Organics   Schizophrenics

N                     19             14             33               50
Age:  Mean           41.3           30.9           36.9             30.9
      Range         30-49          18-46          18-49            18-49
IQ:   Mean          103.2           96.1          100.2            107.5
      Range        87-124         81-118         81-124           80-128
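As a consistency check on Table 1, the means for the combined organic group should equal the size-weighted means of the two organic subgroups. The sketch below is an editorial illustration; the helper name `pooled_mean` and the dictionary layout are our own, and the numbers are those printed in the table.

```python
# Subgroup sizes and means as printed in Table 1.
subgroups = {
    "non-convulsive": {"n": 19, "age": 41.3, "iq": 103.2},
    "convulsive": {"n": 14, "age": 30.9, "iq": 96.1},
}

def pooled_mean(groups, key):
    """Size-weighted mean of `key` across the subgroups."""
    total_n = sum(g["n"] for g in groups.values())
    return sum(g["n"] * g[key] for g in groups.values()) / total_n

pooled_age = pooled_mean(subgroups, "age")
pooled_iq = pooled_mean(subgroups, "iq")
print(round(pooled_age, 1), round(pooled_iq, 1))
```

Rounded to one decimal place, the pooled values agree with the 36.9 and 100.2 given for the total organic group.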


¹The authors are indebted to Miss Louise Hewson, Director, Department of Psychology, Neuro-
logical Institute, New York, for her cooperation and helpful suggestions with regard to this project. 
They also wish to thank Dr. Alexander Tolor, for providing the authors with his material on the recall 
of Bender-Gestalt designs. 
Test Administration. The Bender-Gestalt tests were given to patients referred for
routine psychological testing as part of the usual test battery. The directions for the 
recall consisted of asking the patient to reproduce as many of the designs as he could 
from memory, immediately after he had finished copying the designs. Tolor used a 
somewhat different procedure for the recall. He asked his patients, in addition to the 
above directions, to reproduce any parts of the designs which they could remember. 
Because of this extended procedure, the recall scores obtained in the two studies are 
obviously not entirely equivalent. 

The problem of finding suitable criteria for measuring the recall immediately 
presented itself when the data were first examined. While the straightforward pro- 
cedure of counting the number of designs recalled appeared to be the most obvious 
method, many of the designs reproduced were so far from the original stimuli as to 
pose a significant problem in scoring. It was finally decided to utilize three measures
of recall: a Good Recall Score, a Weighted Recall Score, and a Total Recall Score.
The Good Recall Score consisted of all those designs produced in which there were no 
major distortions, omission of parts, or rotations. The Weighted Recall Score was 
composed of those designs included in the Good Recall Score plus one-half credit 
given for each additional design reproduced in which one or more of the above im- 
perfections were discernible. The last measure, the Total Recall Score, was made up 
of all those designs produced in which any portion of the original stimulus was recog- 
nizable. The maximum score for each of the three measures was of course nine, the 
number of designs in the Bender-Gestalt Test. The Total Recall Score was felt to 
be most comparable to Tolor’s recall measure. 
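The three recall measures can be expressed as a small scoring routine. This is a minimal sketch, assuming each recalled drawing has already been judged by a scorer on two points. The class and field names are hypothetical, and because the text does not spell out how the judgments overlap, the sketch keeps them as two separate flags.

```python
from dataclasses import dataclass

@dataclass
class Reproduction:
    """One recalled drawing as judged by a scorer (field names are hypothetical)."""
    design: str
    recognizable: bool        # any portion of the original stimulus is recognizable
    major_imperfection: bool  # major distortion, omission of parts, or rotation

def recall_scores(reproductions):
    """Compute the Good, Weighted and Total Recall Scores for one protocol."""
    good = sum(1 for r in reproductions if not r.major_imperfection)
    imperfect = sum(1 for r in reproductions if r.major_imperfection)
    weighted = good + 0.5 * imperfect  # half credit for each imperfect design
    total = sum(1 for r in reproductions if r.recognizable)
    return {"good": good, "weighted": weighted, "total": total}

# HYPOTHETICAL protocol: five designs recalled, two of them with imperfections.
protocol = [
    Reproduction("A", recognizable=True, major_imperfection=False),
    Reproduction("1", recognizable=True, major_imperfection=False),
    Reproduction("2", recognizable=True, major_imperfection=True),
    Reproduction("4", recognizable=True, major_imperfection=True),
    Reproduction("6", recognizable=True, major_imperfection=False),
]
scores = recall_scores(protocol)
print(scores)
```

For this invented protocol the Good Recall Score is 3, the Weighted Recall Score 4.0, and the Total Recall Score 5, each out of a maximum of nine.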


RESULTS AND DISCUSSION


Recall Scores. In order to compare the present findings with those of Tolor, the 
convulsives and the non-convulsive organic patients were compared on each of the 
three measures of recall. Contrary to Tolor’s results the two groups obtained almost 
identical mean scores. A possible explanation of these discrepant results is suggested 
when the age ranges of Tolor’s organic groups are compared. The non-convulsive 
group had a range of 12 to 72 years while the convulsive group had a range of 12 to 
56 years. A recent study by Obrist, using a variety of memory tests, strongly
suggests a marked reduction in the ability of elderly people to retain designs. It is not
unlikely, therefore, that the differences in recall between Tolor’s convulsive and non- 
convulsive groups are partially attributable to the older patients in the non-convul- 
sive group. It should also be mentioned, however, that the lack of agreement be- 
tween the two studies may be due as well to differences in administering and scoring 
the design recall. 

Because the two organic groups showed no differences, they were combined and 
the total organic group was then compared with the schizophrenic group. Table 2 
gives the results of the three measures of recall in the total organic and schizophrenic 
groups. As can be seen, the two groups do not differ significantly with regard to 
either the Weighted Recall Score or the Total Recall Score. The difference between 
Good Recall Score means, however, is significant at the .05 level, indicating that the 
schizophrenics are able to reproduce a greater number of designs accurately. While
the differences between schizophrenics and organics on the Good Recall Scores are
significant, the overlap between groups is so large that this measure cannot be used
for individual prediction purposes.


TABLE 2. A COMPARISON OF AVERAGE NUMBER OF DESIGNS REMEMBERED BY
ORGANICS AND SCHIZOPHRENICS ON THREE MEASURES OF SCORING RECALL

Measures                  Organics   Schizophrenics    t**

Good Recall Score           3.33          4.16         2.00*
Weighted Recall Score       4.43          5.09         1.57
Total Recall Score          4.88          5.30         1.33

*Significant at .05 level of confidence.
**Although the means presented in this table are based on the actual number of
designs recalled, the t values were calculated from an arc sine transformation of
the data. This transformation equalizes the variances in the groups.

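The arc sine transformation mentioned in the note to Table 2 can be sketched as follows. This is an illustration under stated assumptions: the source does not say which variant the authors used, so the common angular form asin(sqrt(p)) is assumed, with p taken as the proportion of the nine designs recalled. The function name is our own.

```python
import math

N_DESIGNS = 9  # number of designs in the Bender-Gestalt Test

def arcsine_transform(n_recalled, n_items=N_DESIGNS):
    """Angular transform asin(sqrt(p)) of a recalled proportion, in radians.

    ASSUMPTION: this is the common textbook form; the variant actually
    used in the study is not stated in the source.
    """
    if not 0 <= n_recalled <= n_items:
        raise ValueError("n_recalled must lie between 0 and n_items")
    return math.asin(math.sqrt(n_recalled / n_items))

# Transformed values for every whole-number recall score from 0 to 9.
transformed = [arcsine_transform(k) for k in range(N_DESIGNS + 1)]
for k, value in enumerate(transformed):
    print(k, round(value, 3))
```

The transform stretches the scale near 0 and 9, where raw proportions have little room to vary, which is why it is used to make group variances more nearly equal before a t test. Half-point values from the Weighted Recall Score can be passed to the same function.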


Individual Design Recall. A breakdown of the frequency with which the organics
and schizophrenics recalled individual designs, applying chi square to test for
significance, is presented in Tables 3 and 4. Only the Good Recall and the Total Recall
measures are reported since the attempt here is to illustrate the differences in
frequency of recall when maximally different scoring procedures are used. Also shown
in these tables are the designs ranked from most often recalled, given a rank of 1, to
least often recalled, given a rank of 9, for both the organic and schizophrenic groups.


TABLE 3. A COMPARISON OF THE PERCENTAGES AND PERCENTAGE RANKS OF
ORGANICS AND SCHIZOPHRENICS REMEMBERING EACH OF THE NINE DESIGNS:
GOOD RECALL SCORING

[Percentage and rank columns for Organics (N = 33) and Schizophrenics (N = 50)
are illegible in the source.]

Design     Chi Square*

A             .842
1            2.386
2            6.759**
3             .125
4             .000
5             .127
6             .000
7            1.523
8             .382
Total       12.144

*Chi square values were derived from actual counts.
**Significant at the .01 level of confidence.


TABLE 4. A COMPARISON OF THE PERCENTAGES AND PERCENTAGE RANKS OF
ORGANICS AND SCHIZOPHRENICS REMEMBERING EACH OF THE NINE DESIGNS:
TOTAL RECALL SCORING

[Organics rank and Schizophrenics percentage columns are illegible in the source.]

           Organics (N = 33)   Schizophrenics (N = 50)
Design            %                    Rank              Chi Square*

A               72.7                    5                   .573
1               57.6                    2                  3.019
2               36.4                    4                  7.824**
3               27.3                    8                   .000
4               18.2                    9                   .000
5               69.7                    6.5                 .712
6               75.8                    3                   .000
7               45.5                    6.5                 .798
8               84.8                    1                   .000
Total                                                     12.926

*Chi square values were derived from actual counts.
**Significant at the .01 level of confidence.


Only on Design 2 does a significant difference in frequency of recall occur, with 
the schizophrenic patients remembering this design substantially more often than 
the organic patients. When all nine designs are considered together, the difference 
between the groups, estimated both by summing the chi square values of the in- 
dividual designs and comparing the ranks, is found to be insignificant. This would 
indicate that the schizophrenic and organics tend to have essentially the same pat- 
tern of recall—that is, the designs which are difficult for the organics to remember 
are, by and large, difficult for the schizophrenics to recall. 
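The pooled comparison described above can be retraced from the chi-square columns of Tables 3 and 4. The sketch below is an editorial illustration under two assumptions that the source does not state: each per-design chi square has 1 degree of freedom, and their sum is referred to 9 degrees of freedom. The critical values are the standard tabled ones, and the function name `summarize` is our own.

```python
DESIGNS = ["A", "1", "2", "3", "4", "5", "6", "7", "8"]

# Chi-square values as printed in Tables 3 and 4, in design order.
good_recall_chi2 = [0.842, 2.386, 6.759, 0.125, 0.000, 0.127, 0.000, 1.523, 0.382]
total_recall_chi2 = [0.573, 3.019, 7.824, 0.000, 0.000, 0.712, 0.000, 0.798, 0.000]

# Standard tabled critical values of the chi-square distribution.
CRITICAL_1DF_01 = 6.635   # 1 df, .01 level (ASSUMED df for each design)
CRITICAL_9DF_05 = 16.919  # 9 df, .05 level (ASSUMED df for the summed value)

def summarize(chi2_values):
    """Return (summed chi square, designs significant at .01, overall significant at .05)."""
    pooled = sum(chi2_values)
    flagged = [d for d, v in zip(DESIGNS, chi2_values) if v > CRITICAL_1DF_01]
    return pooled, flagged, pooled > CRITICAL_9DF_05

good_summary = summarize(good_recall_chi2)
total_summary = summarize(total_recall_chi2)
print(good_summary)
print(total_summary)
```

On these assumptions the sums reproduce the printed totals of 12.144 and 12.926. Only Design 2 exceeds the .01 criterion, and neither total reaches the .05 criterion for 9 degrees of freedom.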

In general, these findings with regard to relative frequency of recall of the 
nine designs, are similar to those reported by Tolor. There are, however, marked 
differences in the incidences with which several of the individual designs were recalled.
This is most marked for Design 7 when the Good Recall measure is used.
Tolor’s psychogenic group, for instance, remembered this design approximately 73% 
of the time, whereas the schizophrenics in the present study recalled this design only 
22% of the time. While it is possible that part of this discrepancy is due to the great- 
er severity of mental illness in the schizophrenic group, the fact that the discrepancy 
is considerably reduced when the more global Total Recall Score is used, again sug- 
gests that many of the inter-study differences may be largely a function of the 
measurements employed. 


SUMMARY

The present study was designed to investigate the ability of schizophrenic 
patients to recall the Bender-Gestalt designs as compared with a group of organics. 
In addition, an attempt was made to check Tolor’s findings with an independent 
group of organics taken from the same population used by him. The results indicate 
that the schizophrenic patients recalled a significantly greater number of designs, 
on the average, than did the patients in the organic group, when a stringent measure 
of recall was used. A comparison of the relative frequency with which each design 
was recalled showed that Design 2 was remembered significantly more often by the 
schizophrenic group than by the organic, though the overall pattern of recall for the 
nine designs was quite similar in both groups. Certain of the findings were essentially 
in agreement with Tolor’s results; this study did not, however, confirm that patients 
with convulsive disorders are superior to a non-convulsive group of organics in 
ability to recall designs. The need for standardization of the scoring and administra- 
tion of the Bender-Gestalt recall was suggested if comparison is to be made between 
studies reported by different investigators. 
THE PORTEUS MAZES AND BENDER GESTALT RECALL
BERNARD S. AARONSON
New Castle State Hospital, New Castle, Indiana 


PROBLEM 


In a recent article, Tolor reported a correlation of .50 between full scale
Wechsler-Bellevue IQs and Bender-Gestalt recall. This is in marked contrast to the
lack of any relationship between the number of Bender-Gestalt figures recalled and
Shipley-Hartford scores noted by Aaronson, Nelson, and Holt, and the correlation
of .19 between these same two variables noted by Peek and Olson. It was decided
to test this relationship again in an epileptic population with a high incidence of
feeblemindedness. Since the Porteus Mazes and the Bender-Gestalt both involve the
traversing of lines through space, it was felt that these tasks were more analogous
than Bender recall was with a more verbal task like the Shipley-Hartford. In
addition, the reported correlation between the Shipley and the Porteus is relatively
low, so that a variable which was unrelated to the Shipley might be related to the
Porteus.
PROCEDURE


An exhaustive sampling of the files of the psychology department yielded a 
sample of 104 cases to whom both tests had been administered, 42 males and 62 
females. No individual was included in the sample who was unable to complete the 
Porteus Mazes at a three-year level. The sample ranged in age from 12 to 72. The 
mean age was 31.5 with a standard deviation of 14.0. The test quotients on the 
Porteus Mazes ranged from 20 to 123 with a mean quotient of 72.9 and a standard 
deviation of 23.0. The number of figures recalled on the Bender-Gestalt ranged from 
0 to 9 with a mean of 3.1 and a standard deviation of 3.6. 

The procedures for administering and scoring the Porteus Mazes were as given 
in the manual for this test. The procedure for administering and scoring the
Bender-Gestalt followed that given by Aaronson, Nelson, and Holt. Product-moment
correlations were computed between these variables and between these
variables and age at time of testing. Subsequently, a partial correlation was derived 
in order to eliminate the effect of age. 


RESULTS 


The correlation between Porteus Maze score and age is -.20, significant at the 
.05 level, and between Bender recall and age is -.13, which is not significant. The
correlation between Porteus Maze score and Bender recall is .46 (p < .0003). When, 
however, the effect of age is partialled out, the correlation between Porteus Maze 
score and Bender recall is .21 (p = .05). 
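For readers who wish to retrace the partialling step, the usual first-order partial correlation formula is sketched below. This is an editorial illustration; the function and variable names are our own, and the source does not state how the partial correlation was computed. With the three zero-order coefficients as printed above, the formula returns about .45 rather than .21, so the printed coefficients and the reported partial correlation cannot all be reproduced from one another. The sketch is offered for the formula only, not as a correction of the reported result.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denominator == 0.0:
        raise ValueError("undefined when |r_xz| or |r_yz| equals 1")
    return (r_xy - r_xz * r_yz) / denominator

# Zero-order coefficients as printed in the text above.
r_maze_recall = 0.46   # Porteus Maze score with Bender recall
r_maze_age = -0.20     # Porteus Maze score with age
r_recall_age = -0.13   # Bender recall with age

r_partial = partial_correlation(r_maze_recall, r_maze_age, r_recall_age)
print(round(r_partial, 2))
```

The formula shows why partialling matters only when the controlled variable correlates appreciably with both of the others: the correction term is the product of the two age correlations.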

These data suggest that while a small relationship exists between Bender recall 
and Porteus Maze scores, the major portion of the relationship found is a function 
of aging. When the effect of age is removed, the resulting correlation, while signifi- 
cant, is not sufficient to permit any prediction of intelligence from the number of 
Bender figures recalled. The correlation of .46 obtained when age is not held con- 
stant, is close to the correlation of .50 reported by Tolor. The groups used by him 
are highly heterogeneous with respect to age, but he does not report any control for 
the effect of age on his correlation. In the present instance, where the high incidence 
of feeblemindedness and the large variability in the population should all act to pro- 
duce a high correlation, the low partial correlation of .21 when age is held constant 
suggests that there is only an insignificant relationship between these variables in 
more normally distributed populations. 


SUMMARY 


A comparison of Porteus Maze scores and the number of Bender figures recalled 
by a sample of epileptic subjects with a high incidence of feeblemindedness suggests 
a moderate correlation of .46 between these two variables, which shrinks to .21 when 
the effect of age is held constant. These data suggest that there is no practical re- 
lationship between recall of Bender figures and intelligence. It is suggested that the 
recent report of a correlation of .50 between full scale Wechsler IQs and Bender recall 
is a result of not having held the factor of aging constant. 
THE EFFECTS OF RESERPINE ON PSYCHOTIC PATIENTS OF VARYING
DEGREES OF ILLNESS: A PILOT STUDY* 


PAUL HAUCK, HENRY PHILIPS AND RENATE ARMSTRONG 
East Moline State Hospital, Illinois 


INTRODUCTION 


Since the appearance of the new “wonder drugs” their effects and value have
been evaluated mainly from the behavioral point of view. As yet, few attempts
have been made to use psychological tests to study the patients being
treated with these drugs. Improvement as a result of drug therapy has often
enough been noted, but the important question regarding the nature of the change
has barely been addressed. The present study has two goals: first, to
determine how much change, if any, occurs in a psychotic population treated with
reserpine, and second, to determine whether such changes are fundamental ones, revealing
basic personality changes in the adjustment patterns shown by projective tests.


PROCEDURE 


Four groups of patients at the East Moline State Hospital were selected from 
four wards. Group I consists of the nine most severely regressed male patients from 
the hospital’s most disturbed ward. Group II consists of ten female patients from 
the same type of disturbed ward as Group I except that the ten best patients of this 
ward were selected. Group III consists of thirty-four males from a veterans’ ward 
with moderately regressed patients. This group was divided in half, seventeen forming 
an experimental group and seventeen a control group. Group IV comes from one 
of the better wards in the hospital, to which females are sent after intensive treatment 
for maintenance and follow-up treatment and from which they are discharged, if 
discharge is undertaken. The total sample consists of 67 patients, 17 of whom 
received no drugs and served as controls for the 17 experimental patients of Group III. 

We have then three distinct levels of adjustment as represented by these four 
groups (groups I and II may for practical considerations be classed together.) The 
mean age, time spent in a mental hospital, and broad diagnostic labels for the 
sample are found in Table 1. 


TABLE 1. COMPOSITION OF FOUR GROUPS, COMPRISING 67 PATIENTS, ON WHICH 
THE EFFECTS OF RESERPINE WERE STUDIED 





Group N Sex Mean Age Mean Years Type of 
In Hospital Psychosis 


I M 39.7 schizophrenics 
2 schizophrenics 
with organicity 
II 10 48 . 9 schizophrenics 
: involutional 
III 


Experimental 17 





14 schizophrenics 
2 organics 
Control 17 13 schizophrenics 
3 organics 
12 
2 


IV 14 : 5.33 schizophrenics 


organics 





Before treatment commenced, the sample was given the Rorschach, the Figure 
Drawing Test, and the Bender-Gestalt. After treatment had been given for two 


*Appreciation is acknowledged to the following for their assistance in this study: Dean Tasher, 
Leonard Braunstein, Joseph Klass, and George Hoegh. 
months, the entire sample was retested. Treatment was done with reserpine (Ser- 
pasil), 5 mgm. given intramuscularly every other day for 15 injections. Along with 
this 2 mgm. were administered orally every night for two months. 


RESULTS 

An analysis of Rorschach scores tested for significant differences with Festinger’s 
F Test“: ?) showed no significant changes had taken place in the personality struc- 
tures of the patients as a group which could not have occurred through chance factors 
alone. Of the 160 tests for significance that were done on 32 Rorschach scoring 
categories, only three proved significant at the .05 level (Table 2). These differences, 
because they are so few and are among the unimportant Rorschach categories, 
suggest that the drug made no difference. 
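
The argument from the small number of significant results can be made concrete with a little arithmetic. The sketch below assumes that the 160 tests are independent, which they are not strictly (Rorschach categories are intercorrelated), so the binomial figure is only a rough guide. The count of 160 would correspond to 32 categories in each of five patient groupings; that breakdown is an inference and is not stated in the text.

```python
from math import comb

N_TESTS = 160   # significance tests reported in the text
ALPHA = 0.05    # significance level used
OBSERVED = 3    # tests reported as significant

# Number of "significant" results expected if only chance were operating.
expected_by_chance = N_TESTS * ALPHA


def prob_at_most(k: int, n: int, p: float) -> float:
    """Binomial probability of k or fewer significant results in n independent tests."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


# Probability of seeing three or fewer significant results by chance alone,
# under the (assumed) independence of the tests.
p_three_or_fewer = prob_at_most(OBSERVED, N_TESTS, ALPHA)
print(expected_by_chance, round(p_three_or_fewer, 3))
```

About eight significant differences would be expected from chance alone, so three such differences give no ground for attributing any change to the drug.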


TABLE 2. SIGNIFICANT DIFFERENCES BEFORE AND AFTER TREATMENT 

Group    Category    Means Before and After    F 

II       H           2.40 to 1.00              2.40 
         Hd          1.30 to .20               6.50 
III      A Obj.      .12 to .76                6.33 
         (Between experimental and control group after treatment) 

$$F = \frac{\text{Larger Mean}}{\text{Smaller Mean}}, \qquad \text{degrees of freedom}_1 = \frac{2N (M_1)^2}{(SD_1)^2}, \qquad \text{degrees of freedom}_2 = \frac{2N (M_2)^2}{(SD_2)^2}$$ 
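
The F values of Table 2 can be checked against the formula given beneath the table. The following sketch simply forms the ratio of the larger to the smaller mean; it reads the Group III entry as .12 to .76, and the function names are illustrative only. The degrees-of-freedom function follows the formula as printed, but no value is computed from it here because the N and SD for each category are not reported.

```python
def festinger_f(mean_a: float, mean_b: float) -> float:
    """Ratio of the larger mean to the smaller mean."""
    larger, smaller = max(mean_a, mean_b), min(mean_a, mean_b)
    return larger / smaller


def degrees_of_freedom(n: int, mean: float, sd: float) -> float:
    """Degrees of freedom for one sample: 2N(M)^2 / (SD)^2."""
    return 2 * n * mean ** 2 / sd ** 2


# Pairs of means as listed in Table 2 (Group III entry read as .12 and .76).
table_2_means = {
    "Group II, H": (2.40, 1.00),
    "Group II, Hd": (1.30, 0.20),
    "Group III, A Obj.": (0.12, 0.76),
}

for label, (first_mean, second_mean) in table_2_means.items():
    print(label, round(festinger_f(first_mean, second_mean), 2))
```

The three ratios agree with the tabled values of 2.40, 6.50 and 6.33.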


As a check against these findings the data were studied qualitatively by three 
psychologists experienced in projective testing. The test batteries for each patient of 
the study were reviewed separately by each examiner. The first administration of the 
Bender-Gestalt, Figure Drawing and Rorschach was studied in toto, and the identical 
procedure was followed with the test battery obtained two months after treatment started. 
At no time did the examiners know which test battery was given first and which 
second. Their task was to determine on a subjective basis from the two test batteries 
whether or not the patient had regressed, remained unchanged, or improved. For 
each patient a combined judgment was reached from the three separate judgments. 
This was fairly easy and presented no problems because of the close agreement be- 
tween the three judges. A chi square test for significance was then made on the 
frequency with which the three categories appeared (worse, same, better) in com- 
parison to the frequency of their occurrence had only chance factors been operating. 
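
A chi square test of this kind compares the observed frequencies in the three categories with those expected from chance. The sketch below is a minimal illustration for Group I of Table 3 (1 worse, 4 same, 4 better). It assumes that chance would place patients equally often in each category; the expected frequencies actually used are not reported, and a different expectation would give a different value.

```python
def chi_square_gof(observed, expected=None):
    """Chi square goodness-of-fit statistic; equal expected frequencies by default."""
    total = sum(observed)
    if expected is None:
        # ASSUMPTION: chance distributes patients equally over the categories.
        expected = [total / len(observed)] * len(observed)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


CRITICAL_05_DF2 = 5.991  # chi square critical value, .05 level, 2 degrees of freedom

group_i = [1, 4, 4]  # worse, same, better (Group I, Table 3)
stat = chi_square_gof(group_i)
print(round(stat, 2), stat > CRITICAL_05_DF2)
```

With so few patients in each cell the expected frequencies fall below the sizes usually recommended for chi square, so the result should be read with caution.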

Significant differences seem to have been approached at times but in no case 
were they obtained (Table 3). Clinical observations made by the psychologists and 
those made by the attendants in charge of the sample seemed to be in basic agree- 
ment. Specifically, the attendants in charge of group I felt two patients had calmed 
down somewhat, but the change was not enough to have them transferred to a 
better ward. The patients in the other groups were also thought by the attendants 


TABLE 3. STATUS OF PATIENTS AFTER TREATMENT 

Group              Worse    Same    Better 

I                  1        4       4 
II                 0        7       3 
III 
  Experimental              12      5 
  Control                   14      2 
IV                 3        9       2 













to have calmed down. This seemed to be the general impression for all the groups 
except group IV. These females were the best adjusted of the entire sample and as 
a group reserpine seems to have stirred some of them up more than the other groups. 
Examination of test data indicated that three patients of group IV appeared sicker 
after treatment, nine remained the same, and two improved. 

This paper is being written one half year after the data were collected. Treatment 
has continued for periods of different length for these patients, so that some are 
still on the drug, others have been put on chlorpromazine, and some have discontinued 
it entirely. Among the very regressed patients in groups I and II, none 
have been discharged and three out of the 19 have been transferred to the next best 
ward. One patient who was originally in the group, but refused to take medication 
was transferred. The experimental group of group III has not had a single patient 
discharged, but two have been transferred to a better ward. The control group 
which received no drugs had one patient discharged and one transferred. Thirteen 
of the fourteen in group IV are still on the same ward, and one patient was given a 
conditional discharge. 

In every case it was found that any changes which occurred could not be attributed 
to other than chance fluctuations. Thus, the two questions for which we sought 
tentative answers seem to justify the following remarks: (1) Contrary to 
many other studies, we find no significant change in our patient population which 
seems to be solely the result of drug therapy. (2) Projective psychological tests revealed 
that no basic change had taken place in the personality adjustment patterns, 
a not unlikely finding since our first question was answered negatively. 


CONCLUSION 


To determine how much change, if any, occurs in a psychotic population which 
has been treated with reserpine, and whether such changes are fundamental ones in the 
personality adjustment patterns as revealed through projective tests, four groups of 
institutionalized patients representing three levels of illness were treated with oral 
and parenteral doses of reserpine for two months. An examination of scores of the 
Rorschach tests administered before and after treatment failed to show significant 
changes, as did a larger test battery, including the Rorschach, Figure Drawing and 
Bender-Gestalt, which was judged qualitatively by three judges. 
AN EXPLORATORY STUDY OF SOME PERSONALITY 
CHARACTERISTICS OF GAMBLERS* 


ROBERT P. MORRIS 
Boston University 


There has been little investigation of the personality characteristics of gamblers. 
Theoretical treatments have come from psychoanalysis“: *: ®), primarily concerned 
with the neurotic gambler. In a lone empirical study, Hunter and Brunner“ ad- 
ministered intelligence tests, the Bernreuter Personality Inventory and the Colgate 
Introversion-Extraversion Scale to a large number of avid college gamblers but found 
no common constellations. The present study, exploratory in nature, is an attempt 
to find such trait constellations, utilizing a classification scheme. 


METHOD 


Subjects. S’s were all Harvard College undergraduates, 29 gamblers and 19 
non-gamblers. The two groups did not differ statistically with respect to age, class 
year, or economic group. 


Classification. Based on their responses to a questionnaire aimed at determining 
motives for gambling, the 29 avowed gamblers were classified as “thrill”: 12 S’s; 
“economic”: 5 S’s; and “other”: 12 S’s. “Thrill” gamblers were defined by Bergler’s“) 
six criteria-symptoms of habitually taking chances, the game precluding all 
else, being sure he’ll win in the long run, never stopping when winning, eventually 
gambling for relatively too-large sums, and seeking, consciously or unconsciously, the 
pleasurable-painful sensation of gambling. “Economic” gamblers were those who 
actually gambled in such a way as to make money; i.e., the “percentage player”, or 
the “shark”. With them, gambling is a business venture. The “Other” gamblers 
were a miscellaneous group whose members fitted neither of the above categories. 


Predictions. For traits where theoretical and/or empirical bases existed, between-group 
predictions were made: (a) gamblers will show lower security than non-gamblers, 
as measured by the Maslow Security Scale ®; (b) gamblers will show lower 
social responsibility than non-gamblers, as measured by the Gough, McCloskey and 
Meehl Scale for Social Responsibility “; (c) gamblers will show higher dominance 
than non-gamblers, as measured by the Gough, McCloskey and Meehl Scale for 
Dominance®?; (d) gamblers will show higher femininity than non-gamblers, as 
measured by the Gough Femininity Scale“; (e) there will be a greater discrepancy 
between gamblers’ opinions of themselves and what they feel their friends’ opinions 
are than will be true for non-gamblers, as measured by the “X-O Distance” in Golding’s 
Happiness Scale“; and (f) gamblers will be avowedly less happy than non-gamblers, 
as measured by Golding’s Happiness Scale. 

Corollary hypotheses were made relating to differences among the sub-categories 
within the gambling group itself, using the same measures as between groups. 
(A more complete description of methodology is in Morris®®). 


RESULTS 


This was an exploratory study in which it was hoped that relationships worthy of 
further study would be uncovered; therefore the 10% level was used as a working 
significance level. Results attaining this level are reported. 

Table 1 indicates the findings for the between-group predictions. Gamblers 
tended to be more secure, dominant, and masculine than non-gamblers; they ap- 
peared less responsible and tended to assert a greater discrepancy between inner and 
outer selves. In examining the subgroups of gamblers there were thirty possible com- 


*The author wishes to thank Dr. Gardner Lindzey for his help in the planning and execution of 
this study, and Dr. Paul G. Daston for his help in preparing the manuscript, based on a study done 
by the author at Harvard in 1953. 
TABLE 1. COMPARISONS BETWEEN GAMBLERS AND NON-GAMBLERS 








Measures U-test 
z = 1.80 


= 3.92 


Direction of difference 





Security Gamblers > Non-gamblers 





Responsibility 





Dominance 





Femininity Gamblers < Non-gamblers 





X-O Distance 


Social | Gamblers < Non-gamblers 
| 
| Gamblers > Non-gamblers 





Avowed | No differences 
Happiness | 


*In reverse of the predicted direction. 


Gamblers > Non-gamblers | 1.37 
| 








parisons: for each of the five variables¹, one could compare each subgroup with (a) 
the other two subgroups combined, and (b) each other subgroup separately. 
Of these thirty comparisons, ten reached our working significance level, and are reported 
in Table 2. 
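
The count of thirty comparisons, and the number that might reach the 10% level by chance, can be laid out explicitly. The sketch below enumerates, for each of the five variables on which predictions were made, the three subgroup-versus-rest comparisons and the three pairwise comparisons. The chance expectation assumes independent comparisons, which is an approximation since the same subjects enter several of them.

```python
from itertools import combinations

variables = ["Security", "Social Responsibility", "Dominance",
             "Femininity", "X-O Distance"]
subgroups = ["Thrill", "Economic", "Other"]

comparisons = []
for variable in variables:
    # (a) each subgroup against the other two subgroups combined
    for group in subgroups:
        rest = [g for g in subgroups if g != group]
        comparisons.append((variable, group, " & ".join(rest)))
    # (b) each subgroup against each other subgroup separately
    for first, second in combinations(subgroups, 2):
        comparisons.append((variable, first, second))

ALPHA = 0.10  # working significance level of the study
expected_by_chance = len(comparisons) * ALPHA
print(len(comparisons), expected_by_chance)
```

On this reckoning about three of the thirty comparisons would be expected to reach the working level by chance alone, against the ten reported.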


TABLE 2. COMPARISONS AMONG THE SUBGROUPS OF GAMBLERS 











U ort 
Measures Direction of difference test df 





Other > Thrill = 2.00 
Security Other > Thrill & Econ 2.45 
Other & Econ > Thrill 1.45 07 
X-0 Thrill & Econ > Other 51 .07 
Distance Thrill > Other .67 .05 

















Social Other > Econ 45 5 | .10 
Responsibility 





Thrill > Econ 1.77 -05 





Thrill > Other 1.56 .08 
Thrill > Econ & Other 1.92 .05 
Thrill & Other > Econ 2.03 < .03 


Femininity 























These results suggest that those we called “thrill” gamblers tended to be insecure, 
felt rather isolated, and tended to show feminine rather than masculine traits. 
Those we labelled “economic” tended to feel the least responsible, to be more masculine, 
dominant, and persistent; they felt most isolated from their fellows. Finally, 
those classified as “other” tended to be secure, felt more open and close to others, 
and showed dominant rather than submissive characteristics in their relations with 
others. 

As to the common constellations of traits sought, although it was possible to 
separate gamblers from non-gamblers on the basis of Social Responsibility scores 
alone (p < .05), we could not profitably combine it with any other variable. The 


1No predictions were made for avowed happiness. 







Security measure discriminated with a p = .07 in our sample, but combination of the 
two did not reach any significance. 


DISCUSSION 


In the present study an attempt was made to investigate the gambling person- 
ality by use of the first phase of scientific study: classification. The present classi- 
fication was a crude one—for example, Bergler“ cites four possible subtypes of the 
neurotic gambler alone. These differences had to be glossed over in our one category 
“thrill”. Yet, in some ways, the classification was an effective one—the presence of a 
significant number of differences among the subgroups of gamblers suggests that the 
groupings are meaningful homogeneous entities. Moreover, the presence of marked 
differences between gamblers and non-gamblers on five of the six variables suggests 
that these traits may be central in gambler-non-gambler differences. 

One must raise the question at this point as to the adequacy of the instruments 
used. For example, the Femininity Scale was included as a possible measure of latent 
homosexuality, said by Bergler“? to be a characteristic of the “thrill” gambler. Our 
results did not wholly substantiate the psychoanalytic idea. Although “thrill” 
gamblers were the ‘“‘most feminine” of the three subgroups of gamblers, they tended 
to be “less feminine’’ than the non-gamblers. This may have been a function of an 
inappropriate instrument. Possibly projective techniques would provide a better 
test. In the same vein, perhaps a better test than Golding’s ‘“X-O Distance’’ could be 
provided to measure psychological isolation. 

Finally, a number of methodological considerations make generalizations from 
this study hazardous. For example, the small sample size, as well as the inflation of 
probabilities involved in the multiple tests (among subgroups of gamblers) throws 
some doubt on the significance of the patterns found. In addition, Harvard under- 
graduates perhaps do not best represent the general population of avid gamblers. 


In future investigations, larger numbers of subjects, more clearly delineated along 
the gambling-non-gambling dimension, could clarify the possibilities raised in the 
present study. 


SUMMARY 


This study was an attempt to adumbrate differential patterns of traits for 
gamblers and non-gamblers, using a classification scheme for gamblers. Harvard 
undergraduates who were avowed gamblers were trichotomized as to motive for 
gambling and compared on measures of security, femininity, dominance, social res- 
ponsibility, avowed happiness and psychological isolation with fellow undergrad- 
uates who were avowed non-gamblers. Some marked differences were found, both 
between gamblers and non-gamblers and among subgroups of gamblers, but differ- 
ential patterns did not reach statistical significance. 
A CASE OF ANOREXIA NERVOSA IN A 16 YEAR OLD GIRL¹ 
FRANKLYN N. ARNHOFF, PH.D. 
Veterans Administration Hospital, Salisbury, N.C. 


INTRODUCTION 


Despite the fact that cases of anorexia nervosa have been reported since 1694“), 
they are relatively uncommon and the syndrome’s status as a clearly defined psy- 
chiatric entity remains disputed“: *». Although it is not our purpose to survey the 
already extensive literature on the topic, and the reader is referred elsewhere for 
more detailed information: ? *. 4: 5 8), certain repeatedly reported findings should 
be considered, as they pertain to the case under discussion. Amenorrhea, sudden 
drastic dieting to overcome previous real or imagined obesity, and family conflicts 
have been consistently reported, with psychological explanations for the syndrome 
emphasizing strong oral regressive traits®: ®), denial of adulthood with its hetero- 
sexual implications®: ®, and fantasies of oral impregnation®: ») being greatly em- 
phasized. 

To the present writer’s knowledge, only one paper, presenting findings on five 
cases, has appeared in the psychological literature” and has served to further 
strengthen the current opinion that while anorexia nervosa may represent a clinical 
entity, it does not appear to be a psychological entity, rather representing typical 
conflict situations that may develop in very different personalities, in various types 
of neuroses. 


CASE REPORT 

Our patient is a 16 year old white girl in the eleventh grade, residing in a rural area of the 
state, who was referred for treatment because of her obsessive fear of food of about 1½ years’ 
duration which resulted in a serious weight loss from approximately 118 pounds to 61 pounds 
upon her admission. Less than three years ago the patient became ill with rheumatic fever and 
was hospitalized for two weeks and bedridden for about eleven months at home. During this 
time she was treated with ACTH. She gained weight during her convalescence and was described 
by her parents as having a “moon face’, and a tremendous appetite, with the patient herself 
stating that the puffiness of her face and the fullness of her abdomen made her less attractive 
following her recovery. She then began to diet drastically and became anxious and apprehensive 
whenever she did eat, lest she become fat and unattractive. 

Her past medical and developmental history were essentially negative, with no rheumatic 
fever residuals noted. Although her menses began about age 12 and were regular, amenorrhea 
was reported for eighteen months prior to her psychiatric hospitalization. Prior to her illness, 
she was described by her parents as being even in temperament and fairly sociable, although 
rather shy and expressing feelings of inferiority. School grades and general adjustment appeared 
good. After recovery from her rheumatic fever and her subsequent return to school, she found it 
difficult to adjust, felt rejected by her friends and was very concerned about her appearance. 

Upon her admission to the hospital, the patient was quite fastidious in her appearance, 
washed her hands and face quite frequently during the day and was described as being hyperactive 
on the ward, despite her gaunt and emaciated appearance, and spent considerable time cleaning 
and straightening the ward. Although her eating was quite erratic for months, after about one 
month of psychotherapy she began to gain weight. Initially, she would not eat for days, then 
gorged herself at other times, particularly with sweets, especially ice cream. When she was seen 
for psychological testing one week after her admission, one of her first remarks to the examiner 
was that she was afraid she had eaten too much ice cream for lunch and might become too fat. 

She was seen in psychotherapy three times per week and established a friendly relationship 
with the therapist quite readily, although she was obsessed with her symptoms and pessimistic 
about her prognosis. She expressed frequent concern about the effects of her illness upon her 
parents and the trouble and inconvenience this caused them. After about two months of therapy 
she seemed to be gaining some insight, was able to verbalize some of her hostile feelings toward 
her parents, particularly the mother, as well as towards her peers and began to see some of the 
secondary gain factors to her illness. Therapy was continued and diminished in frequency to once 
per week. The patient was discharged after six months as improved and is currently maintaining 
a good adjustment. 


Parents. The patient’s parents were seen about once a week by one of the social workers during 
the course of their daughter’s hospitalization in an attempt to clarify, if possible, and change 


1These data were obtained while the author was at the University of Nebraska College of Medicine, 
some of their attitudes toward their daughter. Both parents are 50 years old and report one 
pregnancy ending in miscarriage prior to the birth of the patient, their only child. The father was 
described as a mild mannered, quiet carpenter who permitted his wife to dominate the inter- 
views in which they were seen together. While both parents expressed and implied considerable 
guilt feelings about the patient’s illness, this was particularly true of the father who stated many 
times that he felt he had not spent enough time with her. It was the mother who was seen fairly 
regularly in collateral therapy, as the father was usually unable for business reasons to make the 
trip in. The mother viewed the sessions as serving only to provide further information about her 
daughter, yet was quite vague and evasive about specific details. She was quite concerned about 
the townspeoples’ reactions to her daughter’s illness, particularly the fact that she was in a psy- 
chiatric hospital. She described her daughter as a “good, obedient child,” toilet trained by age 2 
and always close to her parents. Even as a little girl, she said the patient was always very con- 
cerned with the family finances and took considerable interest in such matters. From the age of 
15, she was described as being very interested in baby clothes and babies and would, frequently, 
on shopping trips stop and look at them and examine them closely. The mother reported that the 
patient still played with dolls and on one occasion said to her mother that “being grown up is no 
fun; I had so much fun as a little girl.” 

As could be inferred from the above, the mother was quite resistant to any comments and 
interpretations regarding herself. Her own hostile feelings were quite unacceptable to her and 
when such issue was brought up by the social worker, she stated that she never had angry feelings 
as she does not permit herself to feel that way. From these interviews it was the social worker's 
feeling that there was considerable unconscious mother-daughter rivalry and general overprotectiveness 
on the part of both parents. 

Psychological Data. One week after the patient’s admission to the hospital she was seen by the 
psychologist who administered the Verbal part of the WAIS, Figure Drawing, Rorschach, and 
ten TAT cards. 

Her WAIS verbal IQ of 106 was felt to be an accurate evaluation of her functioning with 
very little intra-, or inter-test variability noted. As found by previous investigators“, there 
were indications, more qualitative than quantitative, of an attempt to overintellectualize situa- 
tions, and yet a tendency to make judgments that were often quite unrealistic and poorly based, 
such as judging the distance from Paris to New York as 25,000 miles and giving the population 
of the United States at about a million. 

Fig. 1. FIGURE DRAWINGS: MALE AND FEMALE 

The impression of psychosexual immaturity readily apparent from her figure drawings 
(Figure 1) was amply demonstrated on the TAT, where her feelings of fear and of conflict with 
males were a common denominator in all her themas. Conflict with all human figures was noted 
on all the TAT cards and despite her ofttimes childishly unrealistic solutions in which without 
rhyme or reason everything worked out well, positive interpersonal feelings were rarely shown. 
Her Rorschach protocol, although quite productive (58 responses), contained one M to a part 
figure, 8CF(3-), 5FC, 8P, F%71, F+%82, 1YF, and 3FY(1-). For the most part, her responses 
on both these tasks were brief and rarely expanded or embellished. The following three 
TAT themas are felt to represent her well. 
Card 1: Boy practicing a violin lesson. Didn’t want to so he broke the violin. He hates it 
.... fears punishment. 
Card 13MF: I feel they’re married and....hmmm.... he has hurt her, and he is sorry 
he has done it. If he has hurt her I don’t think it will turn out all right, but if she’s sick I feel 
she’ll be all right. 
Card 16: I want to imagine me in Colorado. Pretty mountains... . with my folks in the 
trailer house. Going to school. I had lots of friends. Everything went on all right until I 
got married and lived with my husband. 


The overall test impression was that of an adolescent girl who is quite reluctant to give up her 
childhood dependent status and accept an adult role, yet who is quite ambivalent about her par- 
ents and harbors a considerable amount of hostility toward her mother, which can only be ex- 
pressed through devious, unconscious channels. Undoubtedly, one aspect of her unwillingness to 
enter adulthood is her fear of males and heterosexuality. 


DISCUSSION 


A comparison of the present test findings with those referred to previously “” 
reveals some similarities despite the unavailability of the complete test records of 
the previously reported cases, yet our patient seems to be unlike any of the five seen 
by these other investigators, a finding not surprising in view of their findings of the 
marked dissimilarity of their patients. 

While psychological test results on a total of six patients certainly cannot be 
construed as a conclusive or a representative sample of patients in this or any diag- 
nostic group, the dissimilarity of test findings and the similarity of life situations and 
conflict areas is certainly not unique. The volumes of research that have appeared 
in the past few years have demonstrated the inability to differentiate specific pat- 
terns or signs that are mutually exclusive for different diagnostic groups, and the 
dynamic configurations and conflict situations we find in anorexia nervosa, for ex- 
ample, are certainly not distinctive. Similarity rather than dissimilarity in the test 
productions would be the surprising findings, so that anorexia nervosa would not be 
expected to be, and has not been demonstrated to be a psychological entity despite 
the marked similarity in the disease syndrome. 


SUMMARY 


A case history of anorexia nervosa with psychological test findings was pre- 
sented and compared with previously reported test findings of five other cases. While 
the clinical syndrome is consistent with the literature, the psychological tests were 
as dissimilar from the previously reported five cases as they were from each other, 
giving further evidence for the contention that anorexia nervosa does not represent a 
psychological entity. 
AN ALPHABETICAL LIST OF MMPI ITEMS 


J. A. MORRIS KIMBER 


Associated Psychological Centers of Southern California 
Los Angeles 


With the Minnesota Multiphasic Personality Inventory appearing in two forms, 
the individual and the group, and with different arrangement and numberings of the 
items on the different forms, a conversion table“) makes possible the identification 
of an item on one form in terms of the other form. Such a conversion table makes no 
provision for the location of an item whose number is not known. Yet from time to 
time the literature affords examples of mention of an item or gives lists of items with- 
out furnishing the reader with any identifying number or numbers. 

In order that every item of the MMPI could be quickly identified in terms of a 
number, the following alphabetical listing was prepared. The words which ordinarily 
appear are the first ones which are different from the words of any and all other 
items. In some cases only one word is needed. For example, in the following list it 
will be seen that there is an item which shows only the word “Children.” (Group 199, 
Individual D-6). The word “children” is sufficient since there is only one item be- 
ginning with the word “children”. However, in the following list a few extra words 
have sometimes been added if space permitted in order that the reader might know 
at a glance that he was locating the item which he intended to locate. 


MINNESOTA MULTIPHASIC PERSONALITY INVENTORY 
Items in alphabetical order. 


Ind. Group Ind. Group 
(Man.) (Booklet) (Man.) (Booklet) 


A large number D3 558 Dirt HA2 510 
A minister D-21 53 During one D32 38,311 
A person should B-37 11 During the A-1 153 
A person shouldn’t E-8 Even when F-35 305, 366 
A windstorm Everything is D-22 58 


Almost every day Everything tastes G-11 
Any man Evil spirits H-25 
As a youngster Except by B-39 
At one or more times Horses that 1-45 

At parties I almost never dream B-33 


At periods I am a good mixer E-36 
At times I am I am a high-strung 1-33 

At times I feel like picking I am a special D-24 
At times I feel like smashing I am about as C-17 
At times I feel like swearing I am afraid of being H-48 


At times I feel that I am afraid of finding H-50 
At times I have a strong I am afraid of losing H-55 
At times I have been I am afraid of using H-53 
At times I have enjoyed I am afraid to be alone H-46 
At times I have fits of I am afraid when H-47 


At times I have very I am against D-51 
At times I have worn I am almost never B-9 
At times I hear so well I am always disgusted 

At times I think I am an important H-19 
At times it has been I am apt to hide F-27 


At times my thoughts have I am apt to pass . . because 
Bad words I am apt to pass. . when 
Children I am apt to take 

Christ I am attracted 

Criticism I am bothered by acid 





I am bothered by people 
I am certainly lacking 

I am easily awakened 

I am easily downed 

I am easily embarrassed 


I am embarrassed by 
I am entirely 

I am fascinated 

I am greatly bothered 
I am happy 


I am in just as good 
I am inclined 

I am liked 

I am likely 

I am made 


I am more 

I am neither 

I am never 

I am not afraid of fire 
I am not afraid of mice 


I am not afraid of picking 
I am not afraid to handle 
I am not bothered 

I am not easily 

I am not unusually 


I am often afraid of 

I am often afraid that 
I am often inclined 

I am often said 

I am often so annoyed 


I am often sorry 
I am quite 

I am so touchy 
I am sure I am 
I am sure I get 


I am troubled by attacks 

I am troubled by discomfort 
I am usually calm 

I am very careful 

I am very religious 


I am very seldom 

I am very strongly 

I am worried 

I believe I am a 

I believe I am being fol. 


I believe I am being plot. 
I believe I am no more 

I believe in a life 

I believe in law 

I believe in the 


I believe my sense 

I believe my sins 

I believe that a 

I believe that my home life 
I believe there is a Devil 
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d. 


Ind. Group 
(Man.) (Booklet) 


H-1 
I-39 
B-31 
I-26 
F-44 


C-55 
I-40 

G-46 
A-28 
G-12 


A-2 
I-32 
E-39 
E-23 
H-37 


F-10 317, 
AA 

5-21 
H-36 
H-39 


H-44 
H-33 
B-14 
G-28 
F-6 


J-36 

A-53 
D-40 
G-27 
F-24 


F-32 
E-26 
F-18 
H-3 

G-55 


A-15 
B-17 
F-37 
J-14 
D-8 


B-18 


448 
86 
5 
82 
321 


I believe there is a God 
I believe women 

I blush 

I brood 

I can be friendly 


I can easily 

I can read 

I can remember 
I can sleep 

I can stand 


I cannot do 

I cannot keep 

I cannot understand 
I certainly feel 

I commonly hear 


I commonly wonder 

I could be 

I cry 

I daydream very little 
I deserve 


I dislike having 
I dislike to 
I do many 
I do not always 
I do not blame 


I do not dread 

I do not have a 

I do not have spells 

I do not like everyone 
I do not like to 


I do not mind being 

I do not mind meeting 
I do not often 

I do not read 

I do not tire 


I do not try to correct 
I do not try to cover 
I do not worry 

I don’t blame 

I don’t seem 


I dread 
I dream frequently. 


I dream frequently about 


I drink 
I easily become 


I enjoy a race 

I enjoy children 
I enjoy detective 
I enjoy gambling 
I enjoy many 


I enjoy reading 
I enjoy social 

I enjoy stories 
I enjoy the 

I feel anxiety 





Ind. Group 
(Man.) (Booklet) 


D-15 
D-7 

A-51 
F-45 
E-54 


I-1 
A-32 
I-21 
B-32 
J-16 


F-40 


258 
101 
528 
236 


253 


269 
188 
481 
211 
532 


517 
335 
159 
142 
184 


454 
I feel hungry 

I feel like giving up 
I feel like jumping 
I feel sure 

I feel sympathetic 


I feel that I 
I feel that it 
I feel tired 

I feel unable 
I feel uneasy 


I feel weak 

I find it hard to keep 
I find it hard to make 
I find it hard to set 

I forget 


I frequently ask 

I frequently find it 

I frequently find myself 
I frequently have to 

I frequently notice 


I get all 

I get angry 

I get anxious 
I get mad 

I go to church 


I gossip 

I hardly ever feel 

I hardly ever notice 
I hate 

I have a cough 


I have a daydream 
I have a good 
I have a great 
I have a habit 
I have at times had 


I have at times stood 

I have been afraid 

I have been disappointed 
I have been inspired 

I have been quite 


I have been told 

I have certainly had 

I have diarrhea 

I have difficulty 

I have felt embarrassed 


I have few 

I have frequently 

I have had attacks 

I have had blank 

I have had no dif. keeping 


I have had no dif. st. bowel 
I have had no dif. st. urine 

I have had periods . . carried 
I have had periods . . lost 

I have had periods of days 


I have had periods when 
I have had some 

I have had very 

I have little 

I have met problems 


B-13 
F-41 
I-55 

D-13 
F-22 


D-33 
E-3 
J-13 
F-17 
H-49 


A-40 
I-27 
E-4 
C-28 
G-35 


I-14 
D-39 
F-46 
F-9 
A-45 


E-40 
J-43 

H-28 
G-29 
D-10 


J-48 
A-50 
B-10 
C-2 

B-7 

F-16 
B-12 
B-15 


424 
487 
565 
373 
489 


157 

26 
544 
384 
365 


189 


328 


180 
461 
342 


394 
112 
217 
172 


306 
75 
351 
234 
95 


225 
68 
230 
370 
34 


I have more trouble I-23 356 
I have never been in love C-11 324 
I have never been in trouble (sex) E-17 37, 302 
I have never been in trouble (law) E-12 294 
I have never been made E-14 478 


I have never been paralyzed A-41 330 
I have never done D-35 111 
I have never felt A-3 160 
I have never had a faint A-17 174 
I have never had a fit A-18 154 


I have never had any black B-21 542 
I have never had any break’g : 214 
I have never indulged D- 13: 
I have never noticed 2 486 
I have never seen a vision : 464 


I have never seen things 496 
I have never vomited 

I have nightmares 

I have no dread 

I have no enemies 


I have no fear of spiders 
I have no fear of water 
I have no patience 

I have no trouble 

I have not lived 


I have numbness 

I have often been 

I have often felt badly 
I have often felt guilty 
I have often felt that 


I have often found people 
I have often had to 

I have often lost out 

I have often met people 

I have often wished 


I have one or more bad 

I have one or more faults 
I have periods in which 

I have periods of such 

I have reason for feeling 


I have several times given 
I have several times had 
I have sometimes felt 

I have sometimes stayed 
I have strange 


I have strong 

I have the wanderlust 

I have to urinate 

I have used alcohol excess. 
I have used alcohol mod. 


I have very few fears 

I have very few headaches 
I have very few quarrels 

I hear strange things 

I know who 


I like adventure 
I like collecting 
I like dramatics 
I like mannish women 
I like mechanics 





(Man.) (Booklet) 


I like movie love 

I like or have liked 
I like parties 

I like poetry 

I like repairing 


I like science 

I like tall women 
I like to attend 

I like to be with 
I like to cook 


I like to flirt 
I like to go 

I like to keep 
I like to know 
I like to let 


I like to poke 

I like to read about history 

I like to read about science 

I like to read newspaper art. 
I like to read newspaper ed. 


I like to study 
I like to talk about sex 
I like to visit 


Ind. 


J-27 
I-47 
J-33 
J-18 
I-46 


I-50 
J-22 
C-40 
E-33 
J-11 


C-48 
E-34 
G-17 
J-46 

F-20 


H 
I liked “Alice in Wonderland” 


I liked school 


I love to go 

I loved my father 
I loved my mother 
I must admit 

I never attend 


I never worry 

I often feel 

I often memorize 
I often must 

I often think 


I played hooky 

I practically never 

I prefer to pass 

I prefer work 


I read in the Bible 
I readily become 

I refuse 

I resent 

I see things 


I seem to be 

I seem to make 

I seldom or never 
I seldom worry 

I should like 


I shrink 

I sometimes feel 
I sometimes find 
I sometimes keep 
I sometimes tease 


I strongly defend 

I sweat 

I tend to be interested 

I tend to be on my guard 
I think a great 
Group 


566 
423 
547 

78 
550 
221 
441 
429 
254 
140 


208 


Group 


Ind. 
(Man.) (Booklet) 


I think I would . 
I think I would . 
I think I would . 
I think I would. 
I think Lincoln 


. forest r. 

. building 

. dressmaker 
. librarian 


I think most people 

I think nearly anyone 
I think that 

I try to 

I used to have 


I used to keep 

I used to like drop-the-h. 
I used to like hopscotch 
I usually expect 

I usually feel 


I usually have to 

I usually “lay my cards .. ” 
I usually work 

I very much like horseback 
I very much like hunting 


I very seldom 

I wake up fresh 
I was a slow 

I was fond of 

I wish I could be 


I wish I could get 

I wish I were not bothered 

I wish I were not so shy 

I work under a great deal 

I worry over money and bus. 


I worry quite a bit 

I would certainly 

I would like to be a florist 

I would like to be a journal. 
I would like to be a nurse 


I would like to be a p. secy. 
I would like to be a singer 

I would like to be a soldier 
I would like to be an auto r. 
I would like to hunt lions 


I would like to wear 
I would rather win 
If given the chance I could 
If given the chance I would 
If I could get into a 


If I were a reporter . . theater 
If I were a reporter . 
If I were an artist . 
If I were an artist . 


If I were in trouble 


. child’n 
. flower 


If people had not 

If several people 

In a group of people 
In my home 

In school I found 


In school I was 

In school my marks 
In walking 

It bothers 


It does not bother . . suffer 


I-49 
J-7 

I-42 
J-34 
J-17 


D-52 
D-53 
A-9 
E-47 
J-21 


J-25 
I-45 
J-37 
C-29 
F-38 


G-36 
C-32 
C-26 
J-28 
I-51 


F-48 
B-27 
C-36 
D-31 
F-36 


I-17 
C-53 
F-5 
I-34 
C-18 


I-54 

E-10 
I-53 

J-23 

J-1 


J-8 
J-32 
J-10 
J-2 
I-52 


J-35 
J-55 
H-18 
I-28 
J-45 


J-19 


. Sporting J-29 


J-20 
I-44 
E-5 


G-53 
E-1 
F-2 
B-46 
It does not bother . . looking E-49 
It is all right E-6 

It is always F-19 
It is great I-11 
It is not hard 


It is safer 

It is unusual 

It makes me angry 

It makes me feel 

It makes me impatient 


It makes me nervous 

It makes me uncomfortable 
It takes a lot of argument 
It would be better 

It wouldn’t make me 


Life is a strain 
Lightning 

Many of my dreams 
Most any time 
Most nights 


Most of the time I feel blue 
Most of the time I wish 
Most people are honest 
Most people inwardly dislike 
Most people make friends 


Most people will use 

Much of the time I feel 
Much of the time my head 
My conduct is 

My daily life 


My eyesight 
My face 
My family 
My father 
My feelings 


My hands and feet 
My hands have not 
My hardest battles 
My hearing 

My judgment 


My memory 

My mother or father 

My mother was a good woman 
My mouth 

My neck 


My parents and family 
My parents have 

My people 

My plans 

My relatives 


My sex life 
My skin 
My sleep 
My soul 
My speech 


My table manners 

My way of doing things 
My worries 

No one cares 

No one seems 


262 
437 
498 
403 
222 


265 
503 
536 


Often, even though 

Often I can’t understand 
Often I cross 

Often I feel 

Once a week . . . very excited 


Once a week without apparent 
Once in a while I feel hate 
Once in a while I laugh 

Once in a while I put off 

Once in a while I think 


One or more members 
Parts of my 

Peculiar odors 

People can pretty easily 
People generally demand 


People have often 

People often disappoint me 
People say 

Policemen 

Religion 


Several times a week 

Several times I have been 

Sexual things disgust me C-54 
Some of my family have habits C-3 
Some of my family have quick C-2 


Some people are G-50 
Someone has been trying to in. H-6 
Someone has been trying to po. H-14 
Someone has been trying to rob H-15 
Someone has control H-7 


Someone has it 
Something exciting 
Sometimes at elections 
Sometimes I am strongly 
Sometimes I am sure 


Sometimes I become 
Sometimes I enjoy 
Sometimes I feel 
Sometimes I have 
Sometimes my voice 


Sometimes some unimportant 
Sometimes, when embarrassed 
Sometimes when I am not 
Sometimes without 

The future is too 


The future seems 

The man who had 
The man who provides 
The members of my 
The one to whom 


The only interesting 
The only miracles 
The sight of blood 
The things that some 
The top of my head 


There are certain people 

There are persons 

There is something wrong . mind A-25 
There is something wrong . sex B-26 
There is very little love BA 
Ind. Group Ind. Group 
(Man.) (Booklet) (Man.) (Booklet) 
There never was a time J-31 300 When I get bored E-31 
There seems to be a fullness A-13 108 When I leave home G-28 
There seems to be a lump B-11 10 When I take a new job E-16 
These days I find it 84 When I was a child, I didn’t E-19 
Usually I would 435 When I was a child, I belonged D-34 


What others think 170 When in a group E-43 
When a man 485 When someone does E-9 

When I am cornered 475 When someone says G-49 
When I am feeling 375 Whenever possible H-51 
When I am with 48 While in trains E-55 
RELATIONSHIP BETWEEN INTELLIGENCE AND FRUSTRATION-AGGRESSION 
PATTERNS AS SHOWN BY TWO RACIAL GROUPS 


J. L. MC CARY AND JACK TRACKTIR 
University of Houston Baylor University College of Medicine 


PROBLEM 


Frustration as a personality variable and intelligence as a cognitive variable are 
important concepts in understanding man’s reactions to stress. Much work“: ¢: ©) 
has been done in an attempt to understand factors operative in frustration-aggression 
patterns, although the relationship which intelligence has to the patterns has been 
neglected far too frequently. Studies®: * 4 5) have shown the importance of age, 
sex, race, occupation and geographic habitat in measuring aggressive reactions to 
frustration. The purpose of this study was to determine the relationship between 
intelligence and frustration-aggression patterns in a sample of northern Negro and 
white high school students. 


METHOD 


Two tests, the Rosenzweig Picture-Frustration Study and the Otis Quick- 
Scoring Test of Mental Ability, Gamma, Form B, were administered in a Pittsburgh, 
Pennsylvania, high school to 275 students: 108 white boys, 80 white girls, 57 Negro 
boys and 30 Negro girls from middle, middle class families. Ages ranged from 14 to 
22 years. McCary®? had previously investigated the same subjects on type and 
direction of reactions to frustration and while, in general, he did not find significant 
differences in aggressive reactions between northern Negroes and northern whites, 
he did report that the northern Negro females showed more Obstacle-Dominance, 
and less Need-Persistence, than did the northern Negro males. The northern white 
females exceeded the northern Negro females in Intrapunitiveness and Need- 
Persistence, while the opposite was true with regard to Obstacle-Dominance. The 
northern white males showed more Intrapunitiveness than did the northern Negro 
males. No other significant differences among the eight sub-groups were noted. This 
work, however, failed to consider intellectual differences in reactions to frustration. 
RESULTS 


It will be noted in Table 1 that there was a significant difference in intelligence 
between the Negro and white groups, regardless of sex, whereas there were no sig- 
nificant differences within the racial groups. The groups were divided into three 
levels of intelligence, upper one-third, middle one-third, and lower one-third, and 
compared on differences in aggressive reactions to frustration as measured by the 
P-F Study. 
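
The division into thirds can be sketched in code. This is only an illustration: the article does not say how ties or remainders were handled, so the function name, the example scores, and the rule of giving any remainder to the middle group are all assumptions. That rule does happen to reproduce the group sizes reported in Table 2 (for example 26, 28 and 26 for the 80 white girls).

```python
def split_into_thirds(scores):
    """Rank scores ascending and return (lower, middle, upper) thirds.

    Assumption: when the group size is not divisible by three, the
    leftover cases go to the middle group. The article does not state
    its rule; this one is chosen only for illustration.
    """
    ranked = sorted(scores)
    n = len(ranked)
    third = n // 3
    lower = ranked[:third]
    middle = ranked[third:n - third]
    upper = ranked[n - third:]
    return lower, middle, upper


# Hypothetical IQ scores, not data from the study.
example_scores = [88, 92, 95, 97, 99, 101, 103, 104, 106, 108, 110, 115, 118]
lower, middle, upper = split_into_thirds(example_scores)
print(len(lower), len(middle), len(upper))
```

Each sub-group (race by sex) would be split separately in this way before the P-F scores of matching levels are compared.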


TABLE 1. MEANS, STANDARD DEVIATIONS AND CRITICAL RATIOS OF MEAN 
DIFFERENCES IN INTELLIGENCE BETWEEN RACIAL GROUPS AND SEXES 


                          Mean          Critical      Standard 
Group             N       IQ            Ratio         Deviation 


White Male        108     104.8         .20           10.9 
White Female       80     104.5                       10.8 

Negro Male         57     [illegible]   1.13          10.4 
Negro Female       30     [illegible]                 12.2 

White Male        108     104.8         4.60*         10.9 
Negro Male         57     [illegible]                 10.4 

Negro Female       30     [illegible]   [illegible]   12.2 
White Female       80     104.5                       10.8 


*Significant at the 1% level of confidence. 
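
As a rough check on how a critical ratio of this kind is obtained, the sketch below applies the conventional large-sample formula (difference between means divided by the standard error of the difference) to the white male and white female figures in Table 1. The article does not state which formula was used, so the formula and the function name are assumptions; the result comes out close to the tabled value for that comparison.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two means divided by its standard error.

    Assumes independent groups and the usual large-sample standard
    error, sqrt(sd_a^2 / n_a + sd_b^2 / n_b).
    """
    se_diff = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se_diff


# White males vs. white females, values as printed in Table 1.
cr = critical_ratio(104.8, 10.9, 108, 104.5, 10.8, 80)
print(round(cr, 2))
```

A ratio this small is far below the value of about 2.58 conventionally required for significance at the 1% level, which agrees with the statement that there were no significant differences within the racial groups.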


The white subjects were consistently and significantly higher on all three levels 
of intelligence than were the Negro subjects, although only one difference was noted 
when sex differences were compared for the same races. In this comparison, the low 
intellectual Negro males were superior to the low intellectual Negro females. Table 
2 shows the aggressive reactions, in terms of the P-F classifications, of the Negro and 
white, male and female, high, middle and low intelligence groups. Only the differ- 
ences which were significant are presented in the table. 


TABLE 2. MEANS, STANDARD DEVIATIONS AND CRITICAL RATIOS OF MEAN DIFFERENCES (SIGNIFICANT 
AT 5% LEVEL OF CONFIDENCE) ON THE P-F SCORES OF HIGH, LOW, AND MIDDLE IQ NEGRO 
AND WHITE, MALES AND FEMALES 


P-F                                                       Critical   Standard 
Classification   Group                            Mean    Ratios     Deviation 


E                Low IQ Negro Males (19)                  2.47* 
                 Low IQ White Males (36) 

I                Low IQ Negro Males (19)                  2.02* 
                 Low IQ White Males (36) 

M                None 

O-D              High IQ Negro Females (10) 
                 High IQ White Females (26) 

E-D              None 

N-P              Middle IQ Negro Females (10) 
                 Middle IQ White Females (28) 

GCR              Low IQ Negro Males (19) 
                 Low IQ White Males (36) 


*Significant at the 5% level of confidence. 
**Significant at the 1% level of confidence. 
The low IQ Negro males exceeded the low IQ white males on Extrapunitiveness 
(E), while the opposite of this was true for those groups on Intrapunitiveness (I) and 
Group Conformity Ratio (GCR). The high IQ Negro females scored more on Ob- 
stacle-Dominance (O-D) than did their white counterparts, and the middle IQ white 
females exceeded the middle IQ Negro females on Need-Persistence (N-P). The 
importance of the role of intelligence levels in aggressive reactions can be seen by 
comparing the present findings with those of McCary’s study ®?, in which the same 
subjects were not sub-divided according to intellectual levels. 


SUMMARY AND CONCLUSIONS 


The Rosenzweig Picture-Frustration Study and the Otis Quick-Scoring Test of 
Mental Ability, Gamma, Form B, were administered to northern Negro and white, 
male and female high school students. Reactions to frustration of these subjects had 
previously been studied but without taking into consideration their intellectual levels. 
The subjects were divided into high, middle, and low intellectual groups and the 
sub-groups of the two racial groups were compared to find the relationship between 
frustration-aggression patterns and intelligence. The white males and females were 
significantly higher than the Negro males and females on intelligence scores at all 
three intelligence levels. One other difference in intellect was noted in that the low 
intellectual Negro males were significantly higher than were the low intellectual 
Negro females. 

The aggressive reactions to frustration of the racial groups at each of the levels 
of intelligence showed, in general, that there was no consistent relationship obtained 
between intelligence and frustration-aggressive patterns. However, some trends 
were evidenced. There were no significant differences in aggressive reactions be- 
tween high IQ Negro and white males. The low IQ Negro males were more overtly 
aggressive, whereas the low IQ white males were more self-blaming and conformed 
more closely to the group. No significant differences in aggressive reactions between 
Negro and white males were noted at the middle IQ level. 

These findings indicate the importance of considering intellectual ability along 
with other multi-factor influences before establishing normative data which are 
expected to apply equally well to more than one specific group of subjects. 
EDITORIAL OPINION 


CLINICAL “HARD-HEADEDNESS” vs. SCIENTIFIC “CRITICALITY” 


This editorial was stimulated by the attitude of a psychiatric colleague who is a 
board member of a prominent psychiatric journal and believes that psychiatrists are 
more “rock-ribbed, hard-shelled and hard-boiled” than clinical psychologists. While 
prototypical categorizations of members of different professions are completely in- 
valid unless significant sampling or experiential differences can be demonstrated, 
empiric observations suggest that differences in training leading to M. D. or Ph.D. 
degrees may underlie the different behavior traits of “clinical hardheadedness” in the 
psychiatrist vs. the “scientific criticality” of clinical psychologists to the degree to 
which such differences actually exist. Such differences appear to relate in part to 
differing objectives of medical education and graduate training in clinical psychology. 
Medical education has basic science foundations intended to instill scientific atti- 
tudes but to a large degree its methods are operationally designed to teach students 
how to make and execute clinical decisions. The role of the physician is openly con- 
ceived as being actively and directively diagnostic and therapeutic. Such training 
results in clinicians who are self-confident and “hard-headed” in their clinical judg- 
ments, and produces aggressively dominant behavior traits directed towards patients 
and colleagues. Graduate training in clinical psychology is much more research- 
oriented with students being taught to think experimentally and statistically, and 
with training in making active or directive clinical judgments being minimized or 
even rejected in nondirective training centers. 

What evidence can be gathered in support of such contentions? One approach 
would be to make an analysis of the contents of prominent psychiatric and psycho- 
logical journals in order to compare the interests and types of scientific productivity 
of the two professions. To this end, we report below a comparison of the contents of 
three psychiatric journals (American Journal of Psychiatry, American Journal of 
Orthopsychiatry and the American Journal of Psychotherapy) and three psychological 
journals (Journal of Abnormal and Social Psychology, Journal of Consulting Psy- 
chology, and the Journal of Clinical Psychology) based on a tabulation of the number 
and type of articles published in four comparable issues of each journal in 1956.* A 
rough estimation of the degree of scientific sophistication of the individual articles 
was achieved by classifying them into the following categories: 


I. Theoretical-speculative. The argument represents only the author’s personal opinions un- 
supported by any formal scientific collection or analysis of data. Usually, the weight of the 
article is supported by references to authorities or other sources, and represents only validation 
by popular acclamation. 


II. Historical-reviews. Compilations of the opinions or results by various authorities or scientific 
reports; also, eulogies of authorities. 


III. Clinical Studies. Case histories or collections of data by the anecdotal method and with no 
statistical or experimental analysis, and interpreted intuitively. 


IV. Experimental-statistical. Basic research studies classified at two levels of sophistication of 
objective analysis of data involving (a) elementary analysis in terms of simple frequency dis- 
tributions, means and standard deviations, or (b) more advanced analysis of the significance of 
differences by frequency comparisons using Chi square or simple, complex or covariance an- 
alysis of variance methods. 




While immediately admitting the difficulty and invalidity of drawing conclusions on 
such tenuous data known to be influenced by such extraneous factors as editorial 
policies and biases, the data in Table 1 are interesting as indicating rough trends of 
professional interest, methodological sophistication and publication policies between 
the two professions. 
TABLE 1. TABULATION OF CONTENTS OF THREE PSYCHIATRIC AND THREE PSYCHOLOGICAL JOURNALS 
BASED ON FOUR COMPARABLE 1956 ISSUES OF EACH JOURNAL 


                                                                         Type of Content 
                                            Number of                                          Experimental-Statistical 
Journals                                    Articles           Theoretical  Historical  Clinical   Simple     Advanced 
                                            and Pages 


Psychiatric 
American Journal of Psychiatry               59 (305 pages)    10 (16%)      9 (15%)    15 (25%)   14 (23%)   11 (18%) 
American Journal of Orthopsychiatry          49 (753 pages)    31 (63%)      1 (2%)      7 (14%)    5 (10%)    5 (10%) 
American Journal of Psychotherapy            32 (393 pages)    16 (50%)      4 (12%)    11 (33%)    1 (3%) 

Totals                                      140                57 (40%)     14 (10%)    33 (23%)   20 (14%)   16 (11%) 


Psychological 
Journal of Abnormal and Social Psychology   104 (528 pages)     3 (3%)                   4 (4%)     4 (4%)    93 (93%) 
Journal of Consulting Psychology             78 (312 pages)     3 (4%)                                        75 (96%) 
Journal of Clinical Psychology               95 (379 pages)     7 (7%)                   1 (1%)     4 (4%)    83 (87%) 

Totals                                      277                13 (5%)                   5 (2%)     8 (3%)   251 (90%) 



Recognizing that this type of analysis has only suggestive evidential value, but 
keeping also in mind that probably each editor published the best material available 
to him, we note that of 140 articles in the psychiatric journals, 104 (73%) dealt with 
speculative-theoretical, interpretive or anecdotal subject matters while only 36 
(25%) were experimental-statistical studies and of the latter only 16 (11%) utilized 
advanced methods of analysis. In contrast, 90% of the 277 articles in the three psy- 
chological journals were experimental-statistical studies utilizing advanced method- 
ology. A purely impressionistic evaluation of many of the speculative-theoretical 
articles in the psychiatric journals indicates that the content of the articles repre- 
sented only personal beliefs and intuitive clinical judgments on a very low level 
of scientific evidential value. This is particularly true of psychoanalytic journals 
which were not included in this analysis because their contents are currently almost 
100% theoretical-speculative. 
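
The proportions quoted above can be recomputed from the totals in Table 1. The short sketch below does so; the dictionary keys and helper name are labels chosen here for illustration. The exact shares come out fractionally higher than the printed 73% and 25%, which match the sums of the whole-number column percentages in the table (40 + 10 + 23 and 14 + 11).

```python
# Article counts for the three psychiatric journals (totals row of Table 1).
psychiatric = {
    "theoretical": 57,
    "historical": 14,
    "clinical": 33,
    "simple": 20,
    "advanced": 16,
}


def percent(count, total):
    """Share of `count` in `total`, expressed as a percentage."""
    return 100 * count / total


total_articles = sum(psychiatric.values())
non_experimental = (
    psychiatric["theoretical"] + psychiatric["historical"] + psychiatric["clinical"]
)
experimental = psychiatric["simple"] + psychiatric["advanced"]

print(total_articles, non_experimental, experimental)
print(round(percent(non_experimental, total_articles), 1))
print(round(percent(experimental, total_articles), 1))
print(round(percent(psychiatric["advanced"], total_articles), 1))
```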

An interesting technical comparison of editorial policies in psychological vs. 
psychiatric journals is found in the fact that the three psychiatric journals utilized 
1451 pages for 140 articles (10.4 pp. per article) while the three psychological journals 
published 277 articles in 1219 pages or only 4.4 pp. per article. This seems to indicate 
that not only is experimental-statistical reporting more conclusive evidentially but 
also that it is more economical in terms of publication facilities and easier on readers. 
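
The pages-per-article figures follow directly from the page and article counts in Table 1, as this small sketch shows. The variable and function names are illustrative only.

```python
# Page counts per journal and article totals, as given in Table 1.
psychiatric_pages = [305, 753, 393]
psychiatric_articles = 140
psychological_pages = [528, 312, 379]
psychological_articles = 277


def pages_per_article(pages, articles):
    """Average number of pages per article across a set of journals."""
    return sum(pages) / articles


print(sum(psychiatric_pages), round(pages_per_article(psychiatric_pages, psychiatric_articles), 1))
print(sum(psychological_pages), round(pages_per_article(psychological_pages, psychological_articles), 1))
```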


*The January, April, July and October issues (or the closest to these dates) with the exception 
of the February issue of the American Journal of Psychiatry selected because the January issue con- 
sisted largely of reviews of progress in fields. The American Journal of Orthopsychiatry is, perhaps, not 
exactly comparable because it deliberately allots many pages to symposia (which are, however, 
largely theoretical-speculative). 
Welsh, G. S. and Dahlstrom, W. G. Basic Readings on the MMPI in Psychology 
and Medicine. Minneapolis: University of Minnesota Press, 1956, pp. 656. 
$8.75 


This important reference work includes 66 of the most important articles and 
almost 700 references which have appeared on the MMPI during the last 15 years. 


Feigl, H. and Scriven, M. (Eds.) The Foundations of Science and the Concepts of 
Psychology and Psychoanalysis. Minneapolis: University of Minnesota Press, 
1956, pp. 346. $5.00 


This is the first volume in a series entitled ‘‘Minnesota Studies in the Philosophy 
of Science’’. It includes twelve stimulating essays on problems of science with specific 
reference to psychology and psychoanalysis. In an era when some of us tend to be 
concerned too exclusively with objective data, it is valuable to contemplate some of 
the larger problems both within and beyond the boundaries of science. 


David, H. P. and von Bracken, H. Perspectives in Personality Theory. New York: 
Basic Books, 1957, pp. 435. $6.50 


Published under the auspices of the International Union of Scientific Psychology, 
22 leading authorities cooperate in presenting a survey of current trends in person- 
ality theory and research. Six chapters discuss trends in American and European 
psychology. Eight chapters discuss developments in personality theory, three chap- 
ters report methodological problems, and three chapters present commentaries. 


Abramson, H. A. The Patient Speaks. New York: Vantage Press, 1957, pp. 239. 
$3.50 


This book presents excerpts from psychoanalytic ‘‘ego-reconstruction” therapy 
of a patient with chronic eczema. Even though editing has resulted in a nontechnical 
presentation with much being omitted, the remaining verbatim excerpts provide 
source materials concerning what goes on in psychoanalytic therapy. 


Eissler, R. S. et al. (Eds.) The Psychoanalytic Study of the Child. Vol. XI. New 
York: International Universities Press, 1956, pp. 470. $8.50 


Hayes, E. N. (Ed.) Directory for Exceptional Children. Boston: Porter Sargent, 
1956, pp. 247. Second edition. 


Schaffner, B. Group Processes. New York: Josiah Macy, Jr. Foundation, 1956, pp. 
255. $3.50. Proceedings of the second conference on Group Processes sponsored 
by the Josiah Macy, Jr. Foundation. 


O’Connor, N. and Tizard, J. The Social Problem of Mental Deficiency. New York: 
Pergamon Press, 1956, pp. 182. $5.00 


This monograph summarizes research carried out since 1947 by the British 
Medical Research Council’s Social Psychiatry Research Unit. 


Weihofen, H. The Urge to Punish. New York: Farrar, Straus and Cudahy, 1956, 
pp. 213. $4.00. New approaches to the problem of mental irresponsibility for 
crime. 


Sparer, P. J. (Ed.) Personality, Stress and Tuberculosis. New York: International 
Universities Press, 1956, pp. 629. $12.50 
Schonell, F. E. Educating Spastic Children. New York: Philosophical Library, 
1956, pp. 242. $6.00 


Eisenstein, V. W. (Ed.) Neurotic Interaction in Marriage. New York: Basic Books, 
1956, pp. 352. $5.50. Authoritative papers on the situational psychology of 
marriage. 


Jones, M. R. (Ed.) Nebraska Symposium on Motivation. Lincoln, Neb.: University 
of Nebraska Press, 1956, pp. 311. $3.00 (paper) ; $3.50 (cloth) 


Krout, M. H. (Ed.) Psychology, Psychiatry and the Public Interest. Minneapolis: 
University of Minnesota Press, 1956, pp. 217. $4.00. Nineteen authorities repre- 
senting the fields of psychology and psychiatry present statements of their 
opinions concerning the relations of the two fields. 


Odier, C. Anxiety and Magic Thinking. New York: International Universities 
Press, 1956, pp. 302. $5.00 


Steiner, F. Taboo. New York: Philosophical Library, 1956, pp. 154. $4.75 








AN IMPORTANT NEW BOOK 





MENTAL DEFICIENCY 


IN RELATION TO PROBLEMS OF GENESIS, SOCIAL AND 
OCCUPATIONAL CONSEQUENCES, UTILIZATION, 
CONTROL AND PREVENTION 


by 
J. E. WALLACE WALLIN, Ph.D. 


This important book summarizes the author’s life-long 
experience in dealing with practical social problems relating to 
the control of mental defectives. Dr. Wallin bases his authorita- 
tive opinions upon an intensive review of the literature in this 
field and his conclusions reflect the latest scientific information. 
This book differs from other works on mental deficiency in that 
it deals with many practical issues for which society urgently 
demands some answers. It should be read by all psychiatrists, 
psychologists, social workers, educators, parents and other 
workers in the field. 


PRICE: $5.00 - - - ORDER NOW 
THE HOWARD INK BLOT TEST 
A NEW PROJECTIVE TECHNIQUE 


Derived experimentally by James W. Howard, Ph.D. 

An individual test consisting of 12 plates 8½” x 11”. The blots are 
somewhat larger than the Rorschach and with a wider range of 
stimulating features. Blots of varied colors selected from an experi- 
mental sample of 10,000 blots. 

Not intended to be a parallel test to the Rorschach. Statistical find- 
ings from a ‘“‘normal’’ group show the test to differ markedly from 
the Rorschach in ways which should indicate greater sensitivity. 
Selection and arrangement of blots made on the basis of evidence of 
character, degree and spread of stimulation. 

Data from normative sample indicate more frequent light deter- 
mined responses, higher incidence of color responses with different 
C, CF and FC relationships, more movement responses and fewer 
animal responses. Greater diagnostic sensitivity with wider range 
of response. 


Price: $12.50 


HOWARD INK BLOT TEST MANUAL 


MONOGRAPH SUPPLEMENT NO. 10 FROM JULY 1953 ISSUE 
Price $2.00 
ACCEPTANCE 


THE HEART OF THE DEVEREUX 
Schools’ technique is “accep- 
tance” of each individual child, 
whatever his functional level of 
performance or his present de- 
gree of maturity. He comes to 
feel that the professional staff 
members with whom he comes 
into contact have faith in his 
capacity for growth. 


To assist in this the staff pro- 
vides the student with a wide 
range of therapies — medical or 
psychiatric treatment, psycho- 
logical counseling, or psycho- 
analysis when indicated. The 
same staff evaluates every boy 
or girl on admission, in order to 
determine individual needs and 
to insure proper placement in 
whichever of the score of home- 
school units that is best suited 
to his needs. 


Professional inquiries should be addressed to John M. Barclay, Director of Development, 
or Charles J. Fowler, Registrar, Devereux Schools, Devon, Pa. For the western states, 


address Joseph F. Smith, Superintendent, or Keith A. Seaton, Registrar, Devereux 
Schools, Santa Barbara, Calif. 